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Abstract 

'A 

The  traditional  recurrence  for  the  computation  of  exponential  divided 
differences,  along  with  a  new  method  based  on  the  properties  of  the  exponen¬ 
tial  function,  are  studied  in  detail  in  this  paper.  Our  results  show  that  it  is 
possible  to  combine  these  two  methods  to  compute  exponential  divided 
differences  accurately.  A  hybrid  algorithm  is  presented  for  which  our  error 
bound  grows  quite  slowly  with  the  order  of  the  divided  difference. 
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Introduction 


We  need  accurate  divided  differences  for  computing  certain  functions  of 
matrices  f  (A)  by  means  of  the  Newton  interpolating  polynomial  (cf.  section 
6): 

/  (a)  =  *?/•/  +  nfAf/  •  ftu-V). 

k=l  ;= l 

where  A*  stand  for  the  divided  differences  of  /  on  the  eigenvalues  of  A.  One 
can  evaluate  /  (A)  by  computing  first  the  divided  differences  and  then  accu¬ 
mulating  the  polynomial.  The  divided  differences  must  be  of  high  relative 
accuracy  because  they  are  the  coefficients  of  products  of  matrices  which,  in 
some  cases,  have  very  large  norm.  What  makes  such  accuracy  possible  is 
that  the  divided  differences  are  not  for  arbitrary  smooth  functions  /  but  for 
well  known  analytic  functions  such  as  exp,  sin  and  cos.  Thus  we  can  exploit 
their  properties  in  the  computation. 

In  this  paper  we  restrict  our  attention  to  exponential  divided 
differences.  A  new  technique,  namely  argument  reduction  for  matrix 
exponentials,  makes  it  realistic  to  consider  data  sets  with  imaginary  parts 
bounded  by  jr  in  magnitude.  Based  on  this  an  algorithm  is  presented  for 
which  our  error  bound  grows  quite  slowly  with  the  order  of  the  divided 
difference. 

We  begin  by  collecting  together  a  considerable  amount  of  information  on 
divided  differences  and  we  hope  that  there  will  be  other  applications  for 
accurate  divided  differences  of  well  known  functions. 
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1.  Basic  Notation  and  Theorems 


1.1.  Definition  of  Divided  Difference 


Following  McCurdy  [1980],  we  will  use  an  uncommon  but  compact  notation 
for  divided  difference.  For  completeness  and  simplicity  we  use  the  contour 
integral  representation  to  define  the  divided  differences.  Our  attention  will 
be  on  the  basic  properties  (1.2.1),  (1.2.2)  and  (1.2.3)  given  in  section  1.2. 


Let  /  be  a  holomorphic  function  defined  inside  and  on  a  simple  closed 

contour  C  enclosing  the  sequence  Z=[f  i.(z . fn,..]  of  complex  numbers. 

Z  denotes  the  abscissae  (or,  for  those  who  do  not  like  Latin,  data  points  or 
nodes,  or  even  knots).  We  use  to  denote  the  fc-th  order  divided 

difference  of  /  on  . For  any  integer  i  >  0  ,  the  fc-th  order 

divided  difference  A*/  on  Z  is  defined  (following  Gel’fand)  to  be 

**  -  *z)t  * 


The  superscript  of  A*/  denotes  the  order  and  the  subscript  denotes  the 
starting  point  in  Z.  Reference  to  the  abscissae  Z  is  usually  suppressed. 


Remark  1.  An  alternative,  and  more  elementary  definition  (used  in  Conte 
and  de  Boor[1980],  cf.  p.40)  designates  Af/  as  the  coefficient  of  xk  in  the 
unique  polynomial  of  minimal  degree  which  interpolates  /  at  1. 

Remark  2.  Milne-Thomson  [1933]  writes  Af/  as  [ti-fi+i.  ....  ft**],  suppress¬ 
ing  the  function  while  de  Boor  considers  . (i+*l  as  a  linear  func¬ 
tional  whose  value  on  /  is  written  [(vCi-n . ?»«•*]/•  Davis  [1973]  uses 

. £<♦*):  some  others  like  Atkinson  [1978] 


use 
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/[{i-fi+i . fi**]  while  Kahan  and  Farkas  [1963]  and  Gabel  [1968]  use 

&f(ti'ti  + . ti*k)<  which  suggested  the  compact  notation  used  here. 

Vfuch  of  this  introductory  section  is  taken  from  the  thesis  of  McCurdy[l980]. 


1-2.  Basic  Properties  of  Divided  Differences 

Let  denote  the  fc-th  derivative  of  f .  From  basic  complex  analysis  one 
can  deduce  from  (l.  1.1)  that 

(1.2.1)  A if  does  not  depend  on  the  order  of  ti>ti*i . &+*  in  Z. 

(1.2.2)  if  ti*ti+k.  then  A*/  =  *f  , 

Ci+Jk  Ci 

f  (*)/>  \ 

(1.2.3)  if  ti=ti+i=  '  ti+ie‘  then  A »*/  = — 1 in  particular  A *0/ =/({*)• 

Most  definitions  for  divided  difference  are  based  on  (1.2.  l).(  1.2.2)  and 

(1.2.3) .  Thus  our  definition  agrees  with  them  when  the  function  is  holo- 
morphic.  In  this  paper  /  will  be  holomorphic. 


1.3.  Integral  Representation 
Theorem  (Hermite-Gennochi). 

t  *1  ■'•-i 

■■■■i-{titk-titk-i)vk]dv).  -du2dvl.  (1-3.1) 

0  0  0 

Proof.  See  Gel’fand  [1971]. 

Corollary 

1  &if  I ^ -^j- max 1  / 1 .  (1.3.2) 

where  T5  is  the  convex  hull  of  ft.  ...  .ft**. 
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1.4.  Mean  Value  Representation 

For  read  abscissae,  (1.3.1)  implies  that  there  exists  some  such  that 

=  d.4.1) 


One  might  hope  to  generalize  this  representation  for  complex  abscissae  by 
requiring  rj  to  lie  in  the  convex  hull  of  the  abscissae,  but  this  will  not  suffice, 
as  is  easily  seen  by  the  following  example  : 

Example  I.  fj=l,  <2=2.  /(f)=exp(27rif). 

Ai7  =  S-gZj - -  O  +  fM  (1.4-2) 

for  any  finite  rj. 


In  the  above  example,  if  we  require  both  abscissae  to  lie  in  f's  funda- 

(note  that  f  (£+n)=/  (£)  for  any  integer 


mental  domain  :  Re(f)e[-^-,  ^-) 


n),  then  the  best  we  can  have  is  that  there  is  some  rj  close  to  their  convex 
hull  for  which  (1.4.1)  holds.  The  next  example  illustrates  this  property. 


Example  II.  fi=f.  =— f ,  t  is  a  small  non-zero  real  number. 


(1-4.3) 


for  any  real  rj. 


1.5.  Matrix  Representation 

The  traditional  way  of  computing  uses  the  divided  difference  table.  Each 
divided  difference  is  computed  from  its  two  immediate  neighbors  in  the 
column  to  its  left  (use  (1.2.3)  for  coincident  abscissae  and  (1.2.2)  for  the 
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f«  i) 

A  }f 

<2  n<z) 

A  If  * 

•  a  r1/ 


An-i/ 

fn  /(fn) 


For  our  purposes  it  is  more  helpful  to  arrange  the  table  as  an  upper  triangu¬ 
lar  matrix,  for  example 


A/  = 


ftti)  A//  •  •  Af1/ 
/to)  •  a rV 


/(fn) 


(1.5.1) 


The  symbol  A/  =A(Z)/ ,  without  the  superscript  and  subscript,  is  used  here 
to  represent  a  matrix,  not  a  scalar.  Let  Zn  be  the  special  nxn  bidiagonal 
matrix  associated  with  the  ordered  set  Z 


Zn 


<1  1 
<2  1 

•  1 


(1.5.2) 


Theorem  (Opitz)  The  divided  difference  table  is  a  matrix  function 

A/=/(Zn).  (1.5.3) 

Proof.  See  McCurdy  [1980]  or  Opitz  [1964]. 


Remark.  Opitz  [1964]  first  obtained  the  result  but  his  paper  is  little  known  in 
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the  U.S.A.  and  is  in  German.  McCurdy  rediscovered  it  in  1979  when  working 
on  his  thesis. 


1.6.  Our  Objective 

Given  any  Z=[%u$ . Cn].  can  we  compute  Afexp  for  &=0,1 n— 1  with 

guaranteed  high  relative  accuracy?  Using  the  matrix  representation,  it  is 
equivalent  to  ask  "  Can  we  compute  the  first  row  of  Aexp,  or  exp (Zn)  ,  accu¬ 
rately  ?  "  The  answer  is  affirmative  if  the  abscissae  are  close  to  the  real  line. 

In  the  next  two  sections,  we  discuss  some  basic  and  hybrid  methods  for 
computing  Aexp.  In  section  4  we  give  the  results  of  McCurdy  [1980]  for  real 
abscissae  Z,  wliich  show  that  one  can  compute  Aexp  accurately  in  all  cir¬ 
cumstances.  We  turn  to  the  complex  case  in  section  5  and  show  that  in  cer¬ 
tain  cases  the  problem  is  "difficult"  (to  be  precise,  certain  sets  Z  give  unex¬ 
pectedly  small  values  for  A^exp,  and  we  call  them  ■'difficult”).  For  difficult  Z, 
we  cannot  expect  high  relative  accuracy;  the  situation  is  like  approximating 
zero  by  some  non-zero  number.  Finally'  in  section  6,  we  discuss  the  applica¬ 
tion  of  the  divided  differences  to  matrix  exponentials. 
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2.  Basic  Methods  for  Computing  Exponential  Divided 
Differences 


2. 1.  Standard  Recurrence 


Yfhen  all  {*s  in  Z  are  distinct,  we  can  use  the  well  known  recurrence  scheme 
(1.2.2)  to  compute  the  divided  differences  table  A/:  A*/  for  i> 0,  fc&O  and 
k  -hisSn. 


SR  (Standard  Recurrence  scheme)!. 


if/  = 


for  each  A:  =  1,2 n.  andi=l,2,...,n-fc,  where  A?/  =f  (fi).  ■ 


(2.1.1) 


SR  is  probably  the  simplest  algorithm.  It  takes  only  nz/  2  +  0{n)  arith¬ 
metic  operations  to  fill  up  the  whole  of  A /  when  all  data  in  Z  are  distinct. 
However,  when  some  /  (ft)  are  close  together  and  given  to  limited  p-ecision  , 
it  may  produce  enormous  relative  error.  For  example,  consider  the  exponen¬ 
tial  function  on  data  [l,  1.000 1],  Assume  function  values  given  to  8  decimal 
digits,  then 


_i*exp= 


2.7185533  -  2,7182918 

1.0001  -  1 


=  2.7200000  {Arts.  2.71841777...). 


Four  digits  have  been  lost  during  the  subtraction  (which  is  performed 

exactly!).  Notice  that  the  loss  doesn't  depend  on  the  number  of  digits  carried 

by  the  function  values.  The  first  four  digits  of  the  function  values  agree, 

therefore  four  digits  will  be  lost  no  matter  how  many  digits  are  given.  Since 

the  higher  order  difference  of  exp  behave  like  exp  (because  the  derivative  of 

!  Parlett's  Recurrence  for  computing  f{Zn)  (Parlett[  1975])  is  identical  to  the  standard 
iterative  scheme  for  computing  A/.  The  technique  is  based  on  the  commutativity  of  Zn  and 

f  {*„>:  V/CJW-ZCV V  cf-  ParleUt  19793- 
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exp  is  exp),  we  would  expect  A”exp  to  lose  digits  if  the  data  are  as  close 
together  as  in  the  example.  Consequently  when  only  12  or  16  decimals  are 
available  it  is  quite  possible  to  lose  them  all  for  higher  divided  differences! 

If  the  tabular  values  are  the  only  data  then  there  is  no  simple  escape 
from  this  loss  of  information.  That  is  why  divided  differences  have  a  bad 
name  in  practice.  However  in  a  number  of  applications  the  functional  form 
of  /  is  known  (e.g.  exp)  and  can  be  exploited  to  obtain  accurate  values  in 
this  situation.  This  is  the  essential  point  of  our  paper. 

We  shall  suppress  the  reference  to  exp  or  Z  in  the  exponential  divided 
differences  when  it  can  be  done  without  amb.guity.  Thus  A*5,  A  *(Z)  and  A,texp 
may  all  mean  &i(Z)exp. 


2.2.  Special  Formula  for  The  First  Divided  Difference. 

If  the  sine  function  for  complex  arguments  is  available  and  fully  accurate 
then  we  have  a  reliable  formula  for  the  first  divided  difference.  Let 
»B(f<+i+ft)/2  and  2,  then 

A..1  -  e^*1 2 3 4 5  -ef*  _  Ut  e*  -  e~*  _  u  sinh(^)  _  sin(i^) 
ft+i-f  ^-(-^)  y 

If  ^=0,  we  set  A*  =  eft. 


Function  FDD(x.y)  (First  Divided  Difference). 

Given  complex  data  x ,y ,  FDD  will  return  the  value  of  Aj1  ([r.y  ]). 

1.  «=(y +a:)/  2  (or  «=z+(y-x)/2  when  x  and  y  are  close  together), 

2.  if/  =u-x, 

3.  if  ^=0  then  FDD=e*, 

4.  if  if/^Q  then  FDD =e“'Sin(t^)/(i^)  . 

5.  Return.  ■ 
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We  will  use  that  function  freely  in  th<.  rest  of  this  paper  for  our  algo¬ 
rithms  because  FORTRAN  library  functions  have  improved  greatly  since  1970 
in  most  main  frame  computers. 


2.3.  Taylor  Series 

Another  simple  way  to  compute  A  is  by  its  Taylor  series 

Zz 

A  =  Aexp  =  exp (Zn)=/+Z„  +  -^-+  -  . 

Because  of  the  special  structure  of  Zn ,  there  is  an  extremely  elegant  algo¬ 
rithm  for  the  first  row  of  the  matrix  A.  Explanation  is  given  in  Appendix  A. 
This  approach  does  not  apply  when  f  is  known  only  by  its  values  on  Z. 


Algorithm  TS  (Taylor  Series). 

Given  Z  as  in  section  2.1,  this  algorithm  computes  [d(l),rf(2) . d(n)]:= 

[Af.  --.Ap~1]  by  Taylor  series.  In  what  follow,  k  indicates  the  current  loop 
number,  and  s(i)  stores  the  (l.i)-th  element  of  matrix  CZ-n)fc+l-V(&+i-l)! 

TSl.  [Initialize.]  Set  d(i)=s(i)=l/(i— l)!  fori=l,2 n. 

TS2.  [Loop.]  For  fc=l,2,...  until  convergence  do 

TS2.1  s(l)«-{ys(l)/Jfc 
TS2.2  For  i=2,3,...,n  do 

s('i)<-[<ysC0+s('i-l)]<',(* 

d(i)«-d(i)+s(i). 

TS3.  Set  c£(l)=  exp(fi)  and  the  algorithm  terminates.  ■ 


Algorithm  TS  computes  only  the  first  row  of  A.  If  one  wants  the  whole 
divided  differences  table,  one  has  to  use  the  following  TS(II),  which  essen¬ 
tially  computes  the  whole  A  by  repeating  TS  on  the  submatrices  in  Zn. 
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'  Algorithm  TS(II)  (Taylor  Series  algorithm  (II)). 

Given  Z  and  a  matrix  array  F.  this  algorithm  computes  F=&  by  Taylor 
series.  In  what  follow,  k  indicates  the  current  loop  number,  and  F(i.m). 
i>m,  stores  the  (i,m.)-th  element  of  matrix  (Zn)fe+t_m /  (k  +i-m.)!  . 

TS(II)l.  [Initialize.]  Set  F(i,m)=F(rn.i)=l/ for  1  smsisn. 

TS(II)2.  [Loop.]  For  A:  =  1.2,...  until  convergence  do 

TS(II)2.1  For  m=l,2,...,u-l  do 

F(m ,ttl) F(m , m )/  k , 

for  i=77i+l . n  do 

(fc+i-m), 

F(jn  ,i )  *-F{m  ,i ) +F(i,m  ). 

TS(II)3.  For  m  =  1,2 . n  set  F(jn,m)~  exp(<fm)  and  restore  zero  to  the  lower 

parts  of  / .  Le..  F(i,m.)«-0  for  0<7tl  <i,  and  the  algorithm  terminates.  ■ 

Accuracy.  TS  method  is  fast  and  accurate  only  when  all  ft  are  close  to  zsro. 
Let  ysrruuclfl  and  call  it  the  "radius"  of  Z.  Numerical  examples  show  that 

when  the  radius  is  bigger  them  2  or  3,  TS  may  not  be  reliable.  The  situaticn  is 
like  computing  by  its  Taylor  series,  Le.,  by  l-y+^-+--.  In  finite  preci¬ 
sion  arithmetic  .  when  y  is  large,  then  e*7  is  small  and  the  roundoff  ei*ror 
from  the  intermediate  term  (which  is  large)  could  impair  the  accuracy 

of  the  series.  If  one  wants  the  roundoff  of  the  intermediate  terms  to  have  no 
serious  effect  on  e""’’,  say,  confined  to  the  last  binary  digit  of  e"'5',  then  y  must 

be  small  enough  so  that  e-7  s2  ma.'t(^-),  which  implies  y  ^ln2  «  0.7.  It 
seems  reasonable  to  require  7<0.7  if  one  wants  TS  to  yield  accurate  answers. 
Criterion.  Use  TS  when  y  is  less  than  0.7. 

This  criterion  will  be  used  throughout  our  paper,  for  we  need  TS  to  yield 
accurate  answers  in  the  Scaling  and  Squaring  method  in  section  2.4.  One  may 
relax  the  constant  0.7  a  little  bit  but  we  will  stick  on  this  value.  Our  examples 
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(cf.  TabLe  2.3.3)  show  that  the  error  grows  rapidly  with  7  and  it  becomes 
unbearable  when  7  is  bigger  than  2  or  3. 

Remark.  Hie  number  of  terms  £  needed  in  the  series  depends  on  the  radius 
7  and  the  machine  precision  e.  In  appendix  A  we  show  that  in  the  presence  of 
roundoff  it  is  sufficient  to  chose  £  such  that 

2  2r  * E-  (2.3.1) 

i=i+ 1  3- 

For  example,  if  7=0.7.  then  for  s=2-24,  £=9  and  for  e=2-56,  £  =  16. 

Operation  count  and  storage.  The  operation  count  is  2£n  for  TS  and  £n2  for 
TS(II).  where  £  is  the  number  of  terms  needed.  Two  working  n-vectors  are 
required  for  storing  cf  and  s  in  TS  while  a  whole  matrix  is  needed  in  TS(II). 

Numerical  example.  Let  u  -  cos(O.l)  +  isin(O.l).  We  ran  on  a  Vax  1  l/730r 
TS  using  single  precision  (e=2-24)  on  the  set 

Z=[-7u.-7u . -7 u.yu. . yu]*  (2.3.P.) 

with  different  values  of  7  and  71,  where  n  is  the  number  of  points  in  Z.  Here 
7  is  also  the  radius  of  the  data  because  ]u|=l.  Our  results  are  summarized 
in  the  following  table.  Each  entry  in  Table  (2.3.3)  is  the  maximum  magnitude 
of  the  relative  errors  in  A(Z)  as  a  multiple  of  t**.  Note  the  rapid  growth  of 
the  error  as  7  Increase. 


*  Vox  is  a  trademark  of  the  Digital  Equipment  Corp.. 

To  be  precise,  if  2  has  n  point*,  then  the  first  [ 1  ■ }  are  -yu  and  the  other  are  70. 

Thus  the  number  8240  corresponding  to  n  =28  and  7=9.2  means  that  the  maxi  mum  relative 
error  in  is  6240c. 
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71=5 

H 

H 

11 

fi 

3 

II 

-^1 

n=23 

n=29 

7=0.7 

1.3 

2.3 

L9 

4.0 

2.7 

7=1.2 

1.9 

4.9 

8.8 

8.8 

8.8 

7=1.7 

3.8 

7.9 

14.5 

10.6 

30.7 

7=2.2 

38.7 

38.7 

38.7 

38.7 

38.7 

7=2.7 

39.0 

59.7 

122.0 

122.0 

122.0 

7=3.2 

28.7 

96.1 

133.0 

159.0 

159.0 

7=3.7 

291.0 

321.0 

321.0 

484.0 

484.0 

7=4.2 

704.0 

1820.0 

1820.0 

1970.0 

1970.0 

7=4.7 

2250.0 

2250.0 

2300.0 

2300.0 

2980.0 

y=5.2 

2980.0 

5530.0 

8240.0 

8240.0 

8240.0 

Table  (2.3.3).  Max  relative  error  coefficient 
in  Z  (cf.  2.3.2)  with  different  n  and  7. 


2.4.  Scaling  and  Squaring 

2.4.1.  SS  (Scaling  and  Squaring)  method. 

When  the  abscissae  are  not  close  enough  for  TS.  we  can  shift  and  scale  down 
the  size  of  Zn  by  .  for  example,  setting 

Yn  »  -77/). 

where  k  and  77  are  chosen  so  that  Yn  has  small  diagonal  elements.  Since  exp 
has  the  following  properties 

(i)  exp(i4+x/)=e*  exp(i4) 

(ii)  expC^-yl^sexpU). 

we  can  recover  exp(Zi»)  from  F=exp(Yn)  by  exp (Zn)  =  e^fexpCl^)]2*.  The 
matrix  power  can  be  computed  by  repeated  squaring  of  F  (i.e.  F  *-  F*)  k 
times. 

Fotcr  major  steps  for  SS: 

step  1.  Determine  77*  and  k  so  that  Y*={Zn-r\I)/&  has  radius  ss0.7*. 

T  We  usually  use  the  arithmetic  means  of  the  data  as  the  shifting. 

*  The  number  0.7  comes  from  the  criterion  in  section  2.3.  It  is  proved  to  be  almost  the  best 
for  SS  in  UcCurdy[1980]  when  7  is  real. 
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step  2.  Compute  F=exp(Yn)  by  Taylor  series, 
step  3.  ( F  «-  Fz)  k  times, 
step  4.  Shift  back  F:  F  ev  F. 

The  squaring  in  step  3  normally  requires  fcn3/  6  +  n2/  2  +  n/3  operations  (/* 
is  triangular)  and  a  matrix  storage  for  F;  this  is  quite  expensive  when  rt  is 
large.  However,  there  is  an  alternative  method  which  requires  only 
Jfcn2  +  0(1)  operations:  with  some  modification  of  steps  2  and  3,  one  can 
replace  every  "intermediate"  F  by  some  divided  differences  table  (  notice 
that  in  step  2 


2~fcfi  2-* 


and  does  not  generate  a  divided  differences  table).  Consequently  with  the 
backfilling  technique  in  §2.4.2,  one  can  generate  the  whole  matrix  F  from  its 
first  row  and  therefore  only  the  first  row  is  needed  in  the  squaring,  thus 
reducing  the  operations  and  storage  required.  This  method  does  sacrifice 
some  accuracy,  however.  Before  presenting  the  algorithms  (in  §  2.4.4),  we 
describe  the  backfilling  technique  and  discuss  a  subtle  modification  of  steps 
2  and  3.  In  general  we  cannot  avoid  using  a  2-dimensioned  array  to  form  F2 
unless  F  has  some  special  structure. 


2.4.2.  Back  filling  the  divided  difference  table 
Consider  again  the  divided  difference  table  A/ 
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A/'= 


ftti)  A,1/ 

fUz) 


Ap-7 

A?'2/ 


/(fn) 


Algorithm  SR  shows  that  A /  can  be  generated  from  its  diagonal  elements. 
However,  it  is  also  true  that  A /  can  be  generated  from  its  first  row  by  the 
formula  (2.1.2):  given  A?,Aj.*".Af“l. 


A*/  =  (ft+*  -  <t-,)-At*_Y/  +  A **_!/  (2.4.2. 1) 

for  i=2,3 . n  and  *=0,1,2...., n-i. 


Tlie  only  worry  in  using  formula  (2.4.2. 1)  is  the  propagation  of  the  error 
in  A */ .  which  may  be  serious,  especially  when  the  {*s  are  far  apart.  When 
/=exp  and  Z  is  real  and  in  natural  order  (&<£/  for  i<j)  .  (2.4.2. l)  is  reliable 
because  all  Apexp  and  ((■<+.*  —  C<-j)  are  positive,  and  summing  positive 
numbe.rs  is  quite  stable.  Thus  bad  situations  occur  only  when  Z  has  large 
variation  in  the  im«iginary  parts.  In  this  case  the  backfilling  step  frequently 
exhibits  instability.  The  following  is  a  typical  example. 

Numerical  example.  Let  Z=[-24i,~21i,— 18i . lBi,21i,24i].  We  compute 

the  last  column  of  A(Z):  for  *  =  1,2,.. .,18  by  backfilling  and  compare  it 

with  the  correct  answer.  The  last  column  of  the  table  denotes  the  magnitude 
of  the  relative  errors  in  the  corresponding  divided  difference  A£  ~fc. 
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Correct  values 
to  six  digits 


Backfilling 


rel.  error 


1 

2 

3 

4 

5 

6 

7 

8 
9 

10 

11 

12 

13 

14 

15 

16 


.699024e-16 

.118971e-15 

-.375574e-13 


.000000e+00) 

.167766e-14) 

.535368e-141 


168358e-12  -.780?34e-12) 
.149915e-10  -.436262e-ll) 


.976633e-10 
-  42463 le-08 
-.333270e-07 
.800392e-08 
.678838e-05 

-.912472e-04 

.761200e-03 

.538045e-02 

.390049e-01 

(-.121109 

(-.580745 


.26427Be-09) 
.  192067 e-08) 
.616516e-07) 
.508937e-06) 
.917160e-05) 

.781070e-04) 
.771374e-03) 
•.611926e-02) 
.296790e-01) 
.184993  ) 

.323969  ) 


.699024e-16 

.118971e-15 

.375574e-13 

.169359e-12 

.149915e-10 

.976634e-10 

.424635e-03 

.333270e-07 

.800473e-06 

.678798e-05 

-.913509e-04 

-.7608 14e-03 
.544411e-02 
.389112e-01 
(-.135378 
(-.584346 


.000000e+00) 
.  167766e-14) 
.535383e-14) 
-.780734e-12) 
-.436262e-ll) 

.264279e-09) 

.  192069e-08) 
-.616534e-07) 
-.508930e-06) 
.917475e-05) 

.780928e-04) 
-.774224e-03) 
-.611179e-02) 
.307910e-0l) 
.184486  ) 

-.443728  ) 


.21e-07 
.lle-07 
.21e-07 
.  15e-06 
.52e-06 

.20e-05 

.72e-05 

.26e-04 

•86e-04 

•28e-03 

.B7e-03 

.27e-02 

.79e-02 

.23e-01 

.65e-01 

.18 


Table  (2.4.2.2):  Backfilling  yields  enormous  error  for  Z  with 
a  large  variation  in  the  imaginary  parts. 


2.4.3.  Modification  of  step  2  and  step  3 


We  may  assume  the  data  have  been  shifted,  say.  to  have  mean  0.  For  OsiiSfc. 
define  the  bidiagonal  matrix  to  be 


Z™  = 


2-^! 


1 

2't&  1 


1 

2^f„ 


Also  let  the  diagonal  matrix  R  be 
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Our  objective  here  is  to  replace  every  intermediate  "F"  in  step  2  and  3  by 
exp(ZnW).  so  that  we  cam  apply  the  backfilling  technique  and  avoid  the 
storage  for  a  whole  matrix. 

Modified  step  2.  Compute  Fq  =  exp(Z^)  by  TS. 

Modified  step  3.  Compute  Fi  =  for  i  =  l,2,...,A:. 

Lemma.  Ft  =  exp (Z^*"**)  for  Qsiil;,  in  particular,  Fk  =  exp(z40*)  3  exp(Zn). 

Pnxsf.  Assume  Ft  =  exp(Zi^)  for  some  ZstO,  then 

Fl¥l  -  RF?R~l  =  F[exp(Z^~*))]*F"1 
=  Fexp(2Z^-*>)-/r‘ 

=  exp(2  RZ^^R’1). 

Frt»m  the  definition,  it  may  be  verified  that  Z^'1  =  2RZ^*^R~l  for  yfeO. 
Hence,  Ft*  ,*  exp(2j$fc  The  lemma  holds  when  1=0.  By  induction,  we 

have  Fi  =  exp (Z^fc-1J)  for  iiO.  • 

Since  every  intermediate  "F"  is  of  form  exp(Z^),  each  of  them  is  a 
divided  difference  table  (with  different  scaled  abscissae).  By  the  previous 
section.  F  can  be  generated  from  its  first  row.  Hence  it  is  possible  to  do  the 
squaring  (for  the  first  row)  without  keeping  the  whole  matrix 


2.4.4.  Algorithm  for  SS 

Algorithm  SS.  (Scaling  and  Squaring) 

Given  Z  as  in  section  2.2,  this  algorithm  computes  [d(l) . <f(n)]  := 

[A?, A',  ....  A”-1]  by  scaling  and  squaring.  In  what  follow,  vector  s  stores  the 
current  column  of  F  and  vector  r  stores  the  first  row  of  the  current  F. 

551.  [n  =  t?)  If  n  =  l,  return  d(l)=e^‘  and  the  algorithm  terminates. 

552.  [Shifting.]  Set  T7=(2ft)/n  and  replace  ft  by  ft— rj. 
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SS3.  [Scaling.]  Determine  the  least  integer  fc&O  such  that  S^maxjft  |  :s0.?. 
then  replace  by  2-fc£i  for  all  i. 


SS4.  [TS.]  Call  TS  with  Z  equal  to  the  current  s ,  result  goes  to  d. 


SS5.  [Squaring.]  For  fcA:  =  1.2 k  do 

SS5.1.  Set  s(l)=d(l),  for  i=2,3 . n  do 

SS5.1.1  [Back  fill  the  i-th  column  of  F  in  s .] 

*=s(l) 

s(l)=d(i) 

For  do 

V=s(j) 

x=y 
next  j 
s(i)=ex p(fc). 


SS5.1.2  [form  the  (l^i)-th  of  RF*R~1.] 

r(i)=2-w£d(j)S(j). 

i= i 


SS5.2.  [Update  d  and  ft's.] 

d{i)*-r(i)  fori=2 . 

ft^Sft  fori =1.2 . 7i 

d(l)=exp(ft). 


n 


SS6.  [Backfill  the  last  column  of  F.]  Set  s(l)=d(l),  for  i=2,3, 
ar=s(l) 
s(l)=d(i) 


For  j=2,..„i-l  do 

y=s0) 

*  0 )  «* + (ft  -iMj-1) 


do 


nexti 


x-y 
next  j 


SS7.  [Shift  back  and  stop.] 

ft4"ft+T7.  d(i)«-et,-d(i)1  s(i-l)*-e,,  s(t-l)  fori=2 . n. 

set  d(l)=exp(ft)  and  s(n)=exp(ftj  and  the  algorithm  ter¬ 
minates.  ■ 


Remark. 

(l)  If  the  function  FDD  (cf.  §2.2)  for  the  first  divided  difference  is  available, 
one  can  improve  the  accuracy  of  SS  by  using  FDD  whenever  the  first 
divided  difference  is  wanted. 
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(2)  SS6  Is  necessary  for  the  Simple  Hybrid  Algorithm  in  the  next  section, 
otherwise  it  is  not  needed. 


The  backfilling  step  may  not  alway  be  stable.  When  Z  has  a  large  varia¬ 
tion  in  the  imaginary  parts  it  is  likely  that  the  formula  (2.4.2. l)  may  fail  to 
yield  reliable  answers.  In  that  case,  the  straightforward  squaring  is  needed. 
For  completeness  and  for  reference,  we  also  lay  out  the  algoriihm  below. 


Algorithm  SS(II).  (Scaling  and  Squaring  algorithm  (II)) 

Given  Z  and  matrix  F,  this  algorithm  computes  F  =  A  by  scaling  and 
squaring.  For  the  R  in  step  5.1.  cf.  section  2.4.3. 

SS(II)1.  [n=l?]  If  n=l,  return  /'(l,l)=efl  and  the  algorithm  terminates. 

SS(II)2.  [Shifting.]  Set  n  and  replace  Z  by  Z-rj. 

S3(II)3.  [Scaling.]  Determine  the  least  integer  fc&D  such  that 
2-fcmaxJ  ft )  =£0.7,  then  replace  Z  by  2 ~kZ. 


SS(II)4.  [TS(II).]  Call  TS(II)  with  data  Z,  result  goes  to  F. 

SS(II)5.  [Squaring.]  For  kk  =  1,2,. ...fc  do 

SS(II)5. 1.  [Update  F.]  F=RF*R~'. 

SS(II)5.2.  [Update  ft's.]  Z«-2 Z. 

SS(II)5.3.  [Update  F(i,i).]  F(i,i)-ex p(ft)  for  i=l,2 . n. 


SS(II)6.  [Shift  back  and  stop.]  Z*-Z+r).  F*-*v-F  and  the  algorithm  ter¬ 
minates.  • 


Operation  count  and  storage.  The  major  part  of  this  computation  is  the 
squaring  step,  which  is  repeated  k  times.  The  operation  count  for  each 
squaring  is  nz+0(l)  in  SS5  and  n3/  6  -nz/2  +  0(n)  in  SS(II}5  (with  n  func¬ 
tion  ca]l  on  exp).  Hence  the  total  operations  need  are  »fcnz  in  SS  and 
»fcn3/6  in  SS(II),  where  k  is  the  least  non-negative  integer  such  that 
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2 ~ky' s  0.7.  Therefore  when  7>0.7,  &=[log2(7/ 0.7)+l]T  «  log27+l-5-  Four  n- 
vectors  are  needed  for  storing  d,s,r  and  Z  in  SS  and  a  matrix  storage  is 
needed  in  SS(II). 

Accuracy.  SS  may  be  viewed  as  an  extension  of  TS  (Taylor  Series).  It  can 
accept  moderately  spread  data  without  suffering  as  much  as  TS  (cf.  §4.3). 
The  follow  example  illustrates  the  big  difference  between  TS  and  SS. 


Numerical  example.  Let  Z=[-16, -12, -8, -4,0, 4,8,12,16].  We  compare  TS  and 

SS  in  the  computation  of  &?(£)  for  A:  =  1,2 . 8.  Results  are  summarized  in 

the  following  table.  The  values  in  the  last  two  columns  are  the  magnitude  of 
the  relative  error  in  the  correspond  divided  differences  ;  notice  the  enor¬ 
mous  error  in  the  first  few  Li[Z)  for  IS. 


correct  values 

to  6  digits 

TS 

op:  645  (for*) 

SS 

op:  1018 

relative 

error(SS) 

relative 

error(TS) 

.150792e-05 

.281912e-01 

,150792e-05 

.12e-06 

.4 

-101027e-04 

.281155e-03 

.101027e-04 

.20e-06 

.83 

•451239e-04 

.242131e-03 

.451239e-04 

.36e-06 

.37 

,151160e-03 

.154376e-03 

.15116Ce-03 

.60e-06 

•21e-01 

•405094e-03 

.405302e-03 

,405095e-03 

.88e-06 

.51e-03 

.904679e-03 

.904669e-03 

.904680e-03 

,12e-05 

.lle-04 

.173175e-02 

,173175e-02 

.173176e-02 

.14e-05 

.48e-05 

.290059e-02 

.290059e-02 

.290059e-02 

.17e-05 

.73e-06 

Table  (2.4.4.  l):  Divided  differences  on  Z,  TS  vs  SS. 


*  Here  7jr7(Z)*ruxl{l-('E(()/n  |  is  the  "radius"  of  Z  (after  it  has  been  shifted) 
t  Here  [x]  denotes  the  greatest  integer  that  less  than  x. 


jamaiai 
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3.  Hybrid  Methods 


3.1.  Example 

Our  discussion  so  far  suggests  that  it  may  be  possible  to  compute  A  accu¬ 
rately  by  combining  the  two  methods  (SR  and  SS)  of  section  2.  Let  us  con¬ 
sider  the  following  task: 

"Given  Z=[50i,  10-s+50i,  —  10-s-50i,  -50i],  compute  A*Aexp." 

In  addition  to  SR  and  SS,  we  can  compute  A  by  the  following  "mixed"  method. 
Decompose  A  into  a  2x2  block  matrix  and  name  the  blocks  I,  II  and  III, 


A?  A/ 

A?  A? 
Ai  A | 

I 

I 

A3  Aj 

A* 

I 

Since  and  are  close  together  (  also  {3  and  ),  SS  is  right  for  them  and 
we  use  SS  to  compute  I  and  II.  Then  we  use  SR  to  fill  up  III. 

In  order  to  compare  this  mixed  approach  with  SS  and  SR.  we  ran  these 
three  algorithms  in  24-binary  digit  (~7  decimal)  arithmetic.  The  results  are 
summarized  in  the  following  table.  For  simplicity  we  only  compare  A?  and 
A?.  The  symbol  p.  in  the  last  column  stands  for  multiplication  or  division; 
thus  6p,  4exp  means  six  multiplication/divisions  and  four  cedis  to  exp  are 
needed. 


_ A! _ 

A? 

SR 

SS 

Mixed 

Exact 

1 

r*.26226Ce-C2  -.97CS43e-02) 
-.262376e-C2  -,9702l9e-02) 
[-.262376e-02  -,9702t8e-02) 
'-.262376d-02  -.97C2l8d-021 

(-.  193573e-03  -.55S794e-10) 
(-.194G77e*G3  -.295204e-07) 
(-.194C43e-03  .207219e-C9) 
f-.194043d-C3  .204t62d-09) 

6p,4exp 

196/x.lOexp 

26/x.lCexp 

The  following  should  be  noticed: 

(1).  SR  gives  poor  results  on  A?  and  Ap. 
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(2) .  The  answers  of  SS  are  not  bad.  This  shows  that  3S  can  indeed  accept 

moderately  spread  data,  but  the  price  is  high. 

(3) .  The  mixed  method  gives  the  most  accurate  answer. 


3.2.  Simple  Hybrid  Method 

The  example  in  §3.1  shows  that  when  one  can  group  the  data  into  clusters 
(allow  overlap) 


z  =  [$  . . - - «rJ 

iV  V 


then  one  can  compute  A(Z)  by 


This  clustering  should  satisfy 


(1)  within  each  diagonal  block  of  Zn  the  data  are  close  enough  together  so 
that  SS  may  be  used  for  the  corresponding  block  in  A, 

(2)  data  belong  to  different  blocks  should  be  sufficiently  separated  so  that 
SR  cam  be  used  to  fill  up  the  rest  of  A. 

This  mixed  approach,  which  we  call  the  simple  hybrid,  method  (SK), 
demands  a  suitable  ordering  on  the  data  Z.  Such  an  ordering  brings 
together  all  close  abscissae  and  we  may  call  it  a  nested  ordering  (to  be 
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defined  precisely  in  §3.3).  Under  a  nested  ordering,  the  radius*  of  each 

[fc.Cfi .  is  close  to  the  distance  between  the  end  points.  In  other 

words,  if  and  are  close  together,  then  ail  . are  close 

together.  In  that  case,  we  can  group  the  abscissae  as  follow  :  the  data 
fi.fi+.j.  ....  Ci+i  will  be  in  the  same  cluster  if  |  Ci+k~Ci !  less  than  some  value 
g  ■  This  g  depends  on  k  (the  number  of  points  in  the  data  set)  only  and  we  will 


discuss  the  value  of  g  =gk  for  each  k  in  §4.4.  For  the  time  being,  assume  gk  is 
given,  we  are  ready  to  describe  the  simple  hybrid  method. 

Method  SR 

[1] .  Determine  the  clustering. 

[2] .  Compute  the  clustered  block  (shaded  area  of  Figure  3.2.1)  by  SS.  Notice 

that  we  only  need  SS  to  return  the  first  row  and  the  last  column  of  eac  a 
block. 

[3] .  Fill  up  the  rest  to  the  first  row  by  SR. 


In  practice,  [l],  [2]  and  [3]  are  alwayj  combined  for  each  cluster.  Here 
is  an  implementation. 

Algorithm  SH  (Simple  Hybrid  Algorithm). 

Given  Z  and  the  decision  function  C,  this  algorithm  computes 
[rf(l) . cf(n)]  :==  [Ap.Ai1 . A”-1]  by  the  simple  hybrid  method.  In  what  fol¬ 

low.  vector  s  will  store  the  last  column  of  the  current  cluster,  vector  d  will 
store  the  first  row  of  the  current  cluster,  will  be  the  currently  computed 
row  number  (of  A)  and  v,  j  will  be  the  first  and  last  index  of  the  next  cluster. 

SHl.[n  =  l?]  If  yes,  set  d(l)=exp((’1)  and  the  algorithm  terminates. 

SH2. [Initialize.]  Set  /4=min|i:  Ift-fn  I  ^  9n-i  J  and  compute  the  fj.- th  row  of  A 

iin 

by  calling  SS,  result  goes  in  Set  j=n 


l  Thu  radius  of  2  is  defined  to  'be  y{Z)  ■  max  |  where  Vs  (Sfi  V  n . 


22 


section  3 


SI!3.[/x=l?}  If  yes,  the  algorithm  terminates. 


SK4.  [Loop.] 

SH4.  l.fFind  the  next  cluster.]  Find  cluster  [v.j ].  . 

(a ) .j=j-l 

(b) .  t/=min[i :  |  ^  |  <;  gj  with  i  <j  $ 

(c) .  if  y£v  then  go  back  to  (a)  else  SH4.2. 

SH4.2.[Update  d  from  1/ to  j] 

SH4.2.  l.[call  S3  on  [f . . ].  ] 

Results  go  to  d\v) . d(j)  and  s(l),...,s(j  —u+l) 

s  is  the  last  column  of  the  cluster. 

SH4.2.2.[Fill  up  d(j  + 1) d(n)  by  SR.] 

For  k-y—u,  y— v—  1 . 1  do 

d(j)=s  («) 

for  i=j  +  l,j  +2 . n  do 

d(i)=[d(i)-rf(i-D]/[fi-^k-1] 

next  i 
next  k 

SH5.  [Update  y.~\  Set  y-v  and  j  -j  —1.  Go  back  to  SK3.  • 


Operation  count  and  storage.  The  total  number  of  operations  depends  on 
the  clustering.  The  worst  case  might  take  0(n3)  but  it  would  be  very  rare, 
e.g.,  if  ■Z’=[l,2.3,..,,2n]  and  g}  =n  for  any  j,  then,  there  will  be  exactly  n  clus¬ 
ters  and  each  cluster  has  n  data  points,  which  means  n- 0(nz)  =  0(n3) 
operations  are  needed  (cf.  the  figure  below). 


Such  a  situation  is  very  unlikely  to  happen  for  a  realistic  set  of  gk.  At  =  1 . 2. _ 

For  our  decision  constants  (will  be  discussed  later),  the  operation  count  is 
usually  0(n3).  Storage  requirements  will  be  the  same  as  SS. 


23 


section  3 


3.3.  Ordering  problem 

When  Z  is  not  nested,  one  may  not  be  able  to  group  the  data  to  have  proper¬ 
ties  (l)  and  (2)  in  §3.2.  In  that  case,  a  much  more  sophisticated  combination 
of  SS  and  SR,  a  recursive  hybrid,  method  .  may  be  needed.  Let  us  consider  a 
different  example  Z=[— 50,  50,  50,  -50].  Since  the  first  and  the  last  elements 
are  equal,  we  cannot  use  SR  for  A?  and  hence  the  whole  of  Z  should  be 
treated  as  one  block.  But  then  SS  is  not  that  suitable  because  the  radius  of 
Z  is  large.  However,  instead  cf  the  whole  Z  we  consider  the  subset  [-50.  50. 
50]  (which  can  be  grouped  into  two  clusters)  and  obtain  the  first  three 
divided  differences  Ai.A’.A?.  As  for  the  last  one,  we  make  use  of  the  fact  that 
it  does  not  depend  on  the  ordering  of  Z.  and  thus  compute  A?  by  considering 
the  reordered  data  set  [-50.  -50,  50,  50].  Notice  that  both  [—50,50,50]  and 
[-50,-50,50,50]  can  be  clustered  for  SH. 

The  disadvantage  of  ;he  above  method  is  that  in  some  sense  the  first 
three  divided  differences  have  been  computed  twice.  Had  we  known  in 
advance  that  the  reordering  would  be  necessary  we  could  have  avoided  the 
repetition  ;  for  in  our  application  the  abscissae  can  be  arranged  in  any 
order  ta  give  a  Z  but  then  it  is  A (Z)  which  must  be  computed.  It  thus  raises 
the  question: 

Does  there  exist  a  nested  ordering  for  any  given  Z  ? 

The  answer  is  yes  when  Z  is  real  (the  natural  increasing  ordering)  but  not 
always  in  general,  e.g.  consider  data  that  form  a  circle  in  the  complex  plane. 
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Before  we  discuss  the  details  of  the  recursive  hybrid  methods,  we  men¬ 
tion  the  decision  function  G  and  the  decision  constants  gk.  A:  =  1.2 .  Given 

any  abscissae  W  with  A:  points,  G(PK)  yields  a  pair  of  points  cJi.cjjCiy 

such  that  |  cji—Uj  ]  is  an  approximation  of  the  radius  of  IV.  As  in  §3.2,  the  deci¬ 
sion  whether  we  should  apply  SS  on  the  whole  of  IV  becomes  the  test 
| cji -Uj  |  ^  g,e.  where  gk  depends  on  A:.  Examples  for  G(F/')=(i>M,ov)  , 
are 

(3.3. 1)  |  =diam(fV). 

(3.3.2)  Re(cjM-cjJ=dia/n.(Re(iy)). 

(3.3.3)  |cj„— vv\  =max|cjt-cj.- 1.  A=|«i€ fy;Re(cji)=maxRe(cj,) 

tijCA  j 

We  will  discuss  G  in  section  4  and  5.  Now  assume  that  G  is  given  and  use  it  to 
define  a  nested  ordering  : 

Definition  3.3.4.  Z  is  nested  (with  respect  to  G )  if 

^([{•i.fiM . {*«*])=({■»«■*.{*)  for  any  lsi.l^fc.i+fcjSn.- 

It  is  easy  to  verify  that  if  G  is  one  of  (3.3.1  -  3),  and  if  Z  is  real,  then  an 
arrangement  of  in  increasing  order  gives  a  nested  ordering. 
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3.4.  Recursive  Hybrid  Method 

Every  divided  difference  can  be  computed  by  A*  =  RH([£t . f*4-i])>  where 

the  function  RH  is  defined  below. 


Recursive  Function  RH  (Z). 

This  function  computes  the  highest  order  divided  difference  on  the  given 
data  Z.  Let  k  denote  the  number  of  points  in  Z,  then  RH  return  Jif-l(2)exp. 


1 

’2 

'3 


.  If  A:  =  1  return  (exp^)). 

.  Compute  G(Z)-{\^V). 

•  ^  I  f  „ I  ^  9t  call  S3  and  return  (d(fc))  else  return  the  following 

rh(zm)  -  nm.zM) 

- <3-4' 


where  2 . . fnl-  • 


We  leave  the  details  of  the  proof  that  RH  does  return  the  highest  divided 
difference  to  the  reader.  Notice  that  when  Z  is  nested. 

Cr([fi . £i**])=(£i*fc.£i)  and  the  above  decision  (step  [3])  means  that 

Aftft . 6b+t]  should  be  computed  by  SS  if  ft  |sEgrfc.  which  is  exactly 

what  SH  did.  Thus 


RH  reduces  to  SH  if  the  abscissae  are  nested.  » 


Since  the  operation  count  of  RH  could  be  enormous,  like  0(cn),  one 
would  hope  to  find  a  nested  ordering  for  the  ft's  to  determine  Z  and  then 
apply  SH  on  it.  A  practical  modification  is  to  attempt  to  nest  the  abscissae 
(according  to  C )  before  steps  [2]  and  [3].  If  it  can  be  done,  then  SH  can  be 
applied  to  the  rearranged  Z  (recall  that  the  divided  difference  does  not 
depend  on  the  ordering  of  the  data).  Later  on  we’ll  see  that  the  abscissae 
can  always  be  taken  close  to  real  (cf.  §5.3)  and  consequently  ordering 
according  to  the  real  part  give  an  almost  nested  ordering,  see  §5.4. 

Our  purpose  in  introducing  RH  is  to  show  that,  in  principle,  A*(Z)exp 
can  be  computed  accurately  using  fixed  precision  arithmetic. 
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4.  Real  Exponential  Divided  Differences 


Exponential  divided  differences  for  real  abscissae  are  positive  and  increasing 
functions  of  their  abscissae.  These  properties  permit  derivation  of  bounds  on 
the  error  growth  in  SR  (Standard  Recurrence)  and  SS  (Scaling  and  Squaring). 
For  future  use,  we  consider  the  more  general  function  expT  with  scaling 
parameter  r.  that  is  expr(f)=eT*.  For  simplicity  we  write 


rln 

exp}n\()=— — (expr)(f)-  In  the  rest  of  this  section,  we  consider  exclusively 
d(n 

divided  differences  on  real  abscissae  even  if  some  of  the  pro¬ 

perties  hold  for  general  complex  abscissae. 


4. 1.  Basic  theorems  and  properties 

Translation  and  seeding  invariance  property.  Let  U  be  the  constant  vector 
[1,1,..., l].  Then  for  any  constants  r.a, 

Ap-KX+cx  t/)expr=eTa-Ap~1(X’)expT,  (4. 1. 1) 

and 

AP~1(-^)e3CPr=Tn~lAP_,(T^)exp.  (4.1.2) 

Proof.  (4.1.1)  follows  easily  from  the  matrix  equation  exp(,4  +aJ)=ea  exp(A ) 
(using  (1.5.3)),  and  (4.1.2)  follows  from  (1.3.1)  directly. 


Recursive  integral  formula.  For  given  X  and  any  7^0,  i=l,2,...,n,  we  have 

Af’expr  =  eTfl7"e-of<'A(Jf8expacfcr,  (4.1.3) 

o 

where 


Aft"2/  =  Ln~2(X{i))f.  X{i)=X\\^l  (4.1.4) 


Proof.  From  the  Hermite-Gennochi  integral  representation  formula  (1.3.1), 
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we  have 


»  *1  vn~i 

Ar-1expT=//-  /  exp^_15[fl+(fJ-f1)i/1-t-  +(fll-fB_,)vn_,]<ii/n_1-  <iv1 
oo  o 
i  'l  "n-a 

0  0  0 

by  the  definition  of  exp^  The  change  of  variables  Oj=TVj  for  j'=1.2,...,n-l 
yields  the  alternative  expression 

r*i  »»-i 

Ar~‘ex?r»/7  •••  /  (4.1.5) 

0  0  0 

We  recognize  that  this  is  a  recurrence  for  Ap^exp,*  namely 


A""‘exp T=er(tf  a  ~*(l-  A?  "^exp^do 
0 

where  0*0^  By  the  symmetry  property  (t.2.  l).  the  ordering  of  the  abscissae 
is  arbitrary:  we  may  replace  f,  by  any  £<t  1  £is;n,  hence  establishing  the  for¬ 
mula.  • 


Theorem  1.  For  all  r> 0  and  A:i0,  A*expr  is 

(i)  positive. 

(ii)  strictly  increasing  in  each  abscissa  for  i  =  l,..,n. 

Proof,  (i)  follows  from  the  mean  value  representation  (1.4.1).  For  (ii), 

£~AP~,expT=^(r-o)e(r-*>f‘  Aflf*exp,do  >  0. 
since  the  integrand  is  positive. 

Theorem  2.  Suppose  y  for  each  abscissa  (v,  lsisn.  Ttien  for  each  i 

there  exists  a  (€[£,7]  such  that 


A(\f*expTs($ -<4 4  )  A{*"lexpT. 


(4.1.6) 
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Proof.  By  (4.1.1)  and  (4.1.2), 

Ar-l("C^-£  C0)exp  =  T-(n~,)e-Tf-AJ*-,(A')expT 

=  ®  f  e~ati  ■  k$)2(X)expado 

o 

for  any  t=l,2,...,n  and  £.  Differentiating  with  respect  to  r  yields 

jpArKTtf-f C0)«p  =  r-^-0.^[(f1-f-^  &f-t(y)expT+A^(y)expT].  (4.1.7) 

Every  element  of  the  vector  X-fiU  is  non-negative,  and  so  A?~1(t(X—0 C/))exp 
is  increasing  in  r.  Similarly,  every  element  of  X-yU  is  non-positive  and 
Ain-,(r(A‘-^yC/))exp  is  decreasing  in  r.  Hence 

•ffir*i~Kr(X-pU))ex p  St  0  2s  ■~^~x{T{X~yU))ex'g> 
so  for  some  f  e  [/3,-y],  the  derivative  is  zero.  The  result  then  follows  from 
(4.1.7).- 

Corollary  1:  Lower  bound  on  AP^exp*.  If  £n  &  for  each  £=1,2 . n,  t  .ien 

Ar_1expT&  -~-  AP_2expT.  (4.1.8) 

71 

Proof.  Choose  £=n,  y=£n  in  (4.1.6),  and  note  that  <  0.  • 

Corollary  2:  Upper  bound  on  Ap-1expT.  If  £,  ■&  for  each  £=1,2 . n.  then 

Af-,exprs  1—A?~2expr-  (4.1.9) 

71  -“1 


4.2.  Error  growth  in  Standard  Recurrence 

We  now  examine  the  error  growth  of  one  step  SR  when  A"  is  in  increasing 
order.  Equation  (4.1.8)  leads  directly  to  a  bound  on  the  relative  error  growth 
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in  one  step  of  SR.  Let  e *  be  the  relative  error  in  Af=A*exp.  i.e.. 

l+£,fc)-A*.  where  fl(Ai)  is  computed  by  SR.  For  simplicity  let  us  first 
assume  that  the  recurrence  step  (2.1.1)  is  done  exactly,  in  which  case  e* 
may  be  regarded  as  the  inherited  uncertainty  of  A*.  We  have 

/I  W)  = 

(i+£^V)A<M'-(i+£r1)^-1 


After  some  algebraic  manipulation,  one  obtains 

' A*~l]  max{ | eft;* I . I *f“l| I- 

By  (4.1.8),  since  £h.m^£j  for  i£j£i+k, 


|  /l(Af)-A* 


2k 


|  s  *  [1+  i-^r].max{  |  | .  |  tf'  I  i 


Therefore,  we  have 

Uncertainty  growthr  of  one  step  of  SR  (with  X  in  increasing  order) 

1**1  ^  [1+ 1  2-fcv  ]  max{  |  efo1  [ .  1  ef  ~M  {•■  (4.2.1) 

U*k~U 


This  bound  is  quite  realistic.  Take  the  example  in  §2.1:  Z=[  1,  1.0001].  Both 
A?=el  and  A°=e ,  0001  can  be  computed  accurately -with  |ef|,|e£|£e  •  so  equa¬ 
tion  (4.2.1)  predicts  |e^|^20001e.  In  §2.1,  with  e=5-10"8,  we  have 
ej =(2.72-2.7 184...)/ (2.718...)  «  0.000582  w  11641e. 

In  finite  arithmetic,  the  execution  of  SR  may  introduce  some  roundoff 
error  to  Af.  An  error  analysis  in  appendLx  B  shows  that  only  a  small 
modification  of  (4.2.1)  is  needed  to  incorporate  the  effects  of  roundoff  into 
the  propagation  of  uncertainty. 

T  This  is  the  growth  of  the  uncertainties  in  the  data.  As  long  as  there  are  uncertainties  in  the 
input,  SR  will  propagate  them  even  if  the  arithmetic  of  each  step  is  done  exactly. 
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Error  growth  of  one  step  of  SR.  Provided  that  £tfc  1  and  are  small*, 
we  have 

|  tt\  <4 s  +  [1+  J~r~ ]' maxi !  *<+'  I  •  I  E*~l  I  i  '  (4.2.2) 

si+Jb  si 

Proof.  See  appendix  B.  ■ 

4.3.  Error  bounds  on  SS 

Based  on  the  positivity  of  (Theorem  1  in  4. 1),  we  can  apply  standard  error 
analysis  to  obtain  relative  error  bounds  on  SS.  For  example,  in  the  squaring 
step,  each  entry  of  the  matrix  is  positive  and  therefore  no  cancellation 
occurs  and  we  have 

|  fl(Bz)  -Bz |  ssne  (52). 

where  |£’|  denotes  the  matrix  all  of  whose  elements  are  the  absolute  values 
of  the  elements  of  E  and  our  notation  A^B  means  that  for  every  i 

and  j. 

A  detailed  error  analysis  of  Algorithm  SS(II)  is  presented  in  appendix  B. 
As  a  direct  corollary  of  equation  (B.6),  we  have  the  following  bound  : 

Scaling  and  squaring  error  bounds.  Given  real  abscissae  X  in  increasing 
order,  denote  the  relative  error  of  A/(Jf)  by  e/  as  in  the  previous  section,  and 
recall  ysrnaxlft— r)\  where  77  is  the  arithmetic  mean  of  f*.  For  convenience 

set  7‘sEmax(7,0.7).  We  have 

Jefl  s:  (C0*  +  C,fcln/+C2/  -2)  e  ,  (4.3.1) 

B  for  details.  In  general,  it  suffices  to  require  them  less  than 


1  See  appendix 
JL.(,4.  ?*  -pi. 

4 
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where  C0=  13.63  and  C,  =  1.4427.  C2  depends  on  e;  in  particular  Ca=134.4 
when  e=2-24,  and  Ca=2 15.426  when  e=2-5S. 

Proof.  See  appendix  B.  • 

Remark.  The  bound  on  e*  is  quite  pessimistic.  Numerical  results  show  that 
most  of  the  time  the  constants  Q  should  be  reduced  to  0.01  times  their 
values  given  above  (cf.  the  remark  in  appendix  B)). 


4.4.  Decision  criteria  for  the  hybrid  methods. 

Using  the  bounds  in  the  previous  section  we  demonstrate  that  one  can  deter¬ 
mine  G  and  so  that  the  recursive  function  RH  (for  the  highest  divided 
difference)  always  yields  a  result  with  bounded  error.  For  convenience,  we 
write  to  indicate  that  X  has  n  abscissae.  The  function  RH{X^n^)  is  : 

(1). 

(2) .  Compute  where  £Msmax£t  and  £„amin{(. 

(3) .  If  |  <;  g caU  SS(II)  and  /?//(*<">)  :=  (cf(l.n))  else 

m*")  ***&*)  ZpX&H  .  (4.4.1) 

where  *[£„& . . £n  J- 

Based  on  the  bounds  (4.3.1)  and  (4.2.2),  we  are  going  to  show  by  induction 
that 

there  exist  some  constants  gj  and  s^\  where  j  =  1,2,...  such  that  for  any 
n  and  X^n^  the  relative  error  e”-1  in  RH(X'n^)  is  always  bounded  by 


Proof. 
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Step  0.  When  n  =  1.  &*}(X)-RH(X)=exp(( ,).  Therefore  can  be  set 
equal  to  e  (we  assume  function  exp  can  be  evaluated  accurately.  i.e.  |t?|<e). 

Step  1.  When  n-2,  assume  <  £2,  then  G(X)=(tz.£l).  Let  -6=f2— £j.  To 
compute  RH{X),  SR  yields  (cf.  4.2.2) 

|e,l|<:4e+(l+4/tf  )  c(°)  <;(5+4/d)s  (4.4.2) 

and  SS  yields  (cf.  4.3.1) 

\e}\  <[2C0+2C1logy-hCsy-2]  e.  (4.4.3) 

Since  ysmax(7,0. 7)  <  max(i$,0.7)  (4.4.3)  becomes 

|e1M<[2£7o+2C1log(max(T5.0.7))  +  C2(max(T3.0.7))-2]e.  (4.4.4) 

Notice  that  the  bound  in  (4.4.2)  is  monotonic  decreasing  in  and  the  one  in 
(4.4.4)  is  monotonic  non-decreasing  so  they  have  only  one  intersection.  Let  it 
occur  at  It  means  that  e}  will  always  be  bounded  by  =  (5+4/<7,)e  if 

one  computes  RH(X)  by  SS  when  G{X)  <gx  and  by  SR  (i.e.,  by  (4.4.1))  other¬ 
wise. 

Step  2.  Assume  that  for  1  <  n  the  assertion  is  true.  i.e.,  e”-1  in  RH(X^n^) 
is  bounded  by  some  constant  for  any  X=X^n\  Consider  X-X^n*lK  Let  t' 

denote  |(„— (^1.  where  G(^"*'J)=(£ „,fM).  To  compute  RH(X),  SR,  or  equation 
(4.4. 1)  yields 

|e”|^4E+(l+2n/i3  )  (4.4.5) 

and  SS  yields 

I  epl  <[nCo+nCilog(max(t>.0.7))4-Cz(max('!>,0.7))— 2]-e.  (4.4.6) 

Again  the  bound  in  (4.4.5)  is  monotonic  decreasing  on  and  (4.4.6)  is  mono¬ 

tonic  non-decreasing  on  so  they  have  only  one  intersection  and  let  it 
occurs  at  &  =  gn-  Therefore  e,n  will  always  be  bounded  by 

e(»»)  a  4e+(l+2n/0n_l)-e(,‘-1!  if  one  computes  RH(X)  by  SS  when  |  <  gn 
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and  by  (4.4.1)  otherwise. 

By  induction,  our  assertion  is  true  for  all  n.  » 

One  can  generate  those  gj.  s^'1  recursively  by  equating  the  bounds  in 
(4.4.5)  and  (4.4.6)  and  solve  it  for  j  =  1.2,...  with  the  initial  value  e^=e  : 

4«+(1+2j/i3  )  eW_h  -  L/C’o+iC,ilog(max(i>,0.?))-i-Ca(inax('i>,0.?))-2]  e  (4.4. ?a) 

with 

gO)  =  [j‘Cc+jCilog(max(-<S,0.7))  +  (?2(mas(-6,0.?))—  2]e.  (4.4.7b) 

For  e=Z~z*.  We  compute  some  of  the  gj  and  according  to  (4.4.7)  and  list 
them  in  Table  (4.4.8).  Therefore  we  have  shown 

Relative  error  bound  in  RH(X>n)).  When  s=2~24,  we  have 

where  gj*-1  is  the  relative  error  of  Af-1  and  the  value  of  gj  and  e^)  are  given 
in  Table  (4.4.8).  ■ 
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j 

9j 

Error  bound 

e^/e 

digits  lost 

O°gio(  )) 

1 

0.02 

0.105e+03 

2.02 

2 

2.100 

0.310e+03 

2.49 

3 

4.846 

0.697e+03 

2.84 

4 

9.227 

0. 131e+04 

3.12 

5 

15.41 

0.216e+04 

3.33 

6 

23.48 

0.326e+04 

3.51 

7 

33.50 

0.463e+04 

3.67 

8 

45.48 

0.626e+04 

3.80 

9 

59.46 

0.316e+04 

3.91 

10 

75.42 

0. 103e+05 

4.01 

20 

345.4 

0.469e+05 

4.67 

40 

1487. 

0.201e+06 

5.30 

60 

3430. 

0.462e+06 

5.67 

80 

6173. 

0.832e+06 

5.92 

100 

9716. 

0. 131e+0? 

6.12 

Table  (4.4.8).  Single  precision  decision  criteria  (e=2  24) 
and  error  bounds  for  the  hybrid  algorithm. 


Remark  1.  The  asymtoptic  value  of  gt  is  i2  +  0(i),  which  can  be  seen  form 
the  equation  e W  =  C^gk  E  and  C2gk+i=(l+  — )Ci3h  obtained  by  omitting 

the  lower  order  terms  in  (4.4.7).  One  can  verify  by  induction  that 
k2—3k  <  gk  <k2  and  consequently  the  error  bound  =  (Cg  kz  +  0{k))s. 

Remark  2.  Although  the  error  bounds  in  Table  (4.4.8)  are  not  ridiculous, 
they  are  quite  pessimistic.  Also  the  value  of  gj  in  the  above  table  is  too  large 
to  be  useful.  For  example,  when  n=20,  gzz= 345.4  and  it  means  that  AZQ  is 
computed  by  (Ag19  —  Aj9)/  (fzi— £j)  only  if  >  345.4  !  Experience  shows 

that  as  long  as  j  >  25  or  28,  SR  always  yields  satisfactory  answers. 

Since  SR  is  much  faster  that  SS,  one  prefers  SR  to  SS  whenever  SR  yields 
satisfactory  results.  So  we  would  like  a  set  of  values  for  gj  and  which  is 
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more  realistic.  Aiter  numerous  numerical  experiments  we  obtained  the  fol¬ 
lowing  experimental  formula  for  gj  and  £^0  (for  „r.y  precision  s). 

Experimental  formula. 


9j  =  (1+  ^-)‘J 

e(J)  = 


(4.4.9) 


The  practical  value  for  g}  is  much  smaller  that  the  one  in  Table  (4.4.9)  (it  is 
like  j  +0. Ij \nj  V.S.  j i2).  For  comparison,  take  ,7=40.  (4.4.9)  yields  54.76  while 
(4.4.8)  yields  1487!  We  ran  our  SH  (with  gj  in  (4.4.9))  on  a  Z  that  has  20  data 
points  distributed  irregularity  from  -27  to  25.  The  results  are  summarized  in 
Table  (4.4.10).  The  last  column  "digit  lost"  is  log10(relative  error). 


71 

B 

HTTTTTHffli 

■SZioaB 

SH  A”-1 

digits 

lost 

1 

-27.0 

0. 1879529e-ll 

0. 187S529e-l  1 

0. 

2 

-26.0 

0.3229560e-ll 

0.3229560e-ll 

0. 

3 

-15.0 

0.23l7134e-08 

0.2317134e-03 

0.09 

4 

-14.0 

0.3012897e-03 

0.3012397e-08 

0. 

5 

-12.0 

0.2993S82e-09 

0.2993632e-08 

0. 

6 

-10.0 

0.224640  le-09 

0.224640  le-08 

0. 

7 

-8.0 

0. 1353474e-08 

0. 1353474e-03 

0.50 

8 

-7.9 

0.4257 157e-09 

0.4257 158e-09 

0.56 

9 

-7.8 

0.9465834e-10 

0.946583 8e-10 

0.61 

10 

0.4272183e-10 

0.4272156e-10 

0.96 

11 

3 

0.2364207e-10 

0  2384209e-10 

1.10 

12 

i.i 

0.637S568e-ll 

0.637S574e-ll 

1.22 

13 

0.120S901e-ll 

0. 1203802e-ll 

1.33 

14 

BUI 

0.1806541e-12 

0. 1806544e-12 

1.45 

15 

3.0 

0.270659  le- 13 

0.2706596e-13 

1.49 

16 

7.0 

0.5415335e-14 

0.5415346e-14 

1.53 

17 

9.0 

0. 1022545e-14 

0. 1022547e-14 

1.53 

10 

13.0 

0.2453 144e-15 

0.2453151e-15 

1.65 

19 

24.0 

0.380400 Oe-15 

0.3803999e-15 

0.72 

20 

25.0 

0. 1456325e-15 

0. 1456325e-15 

0.54 

Table  (4.4.10).  Test  example  for  Simple  Hybrid  Method. 
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5.  Complex  Exponential  divided  Difference 

5.1.  Can  we  have  high  relative  accuracy? 

As  we  have  seen  in  section  4,  the  real  exponential  divided  differences  can  be 
computed  with  high  reLative  accuracy.  What  makes  it  possible  is  that 
) = bf(X ) e xp  is  positive  for  real  X.  This  property  fails  for  complex  data  Z. 
for  A j(Z)  can  take  on  any  complex  value.  However,  one  can  still  say  some¬ 
thing  about  the  error  in  A j(Z).  In  order  to  do  that  some  extra  notation  is 
needed.  Let  X  and  Y  be  the  real  and  imaginary  part  of  Z,  i.e.,  if 

. Cnl.  then  *=[$!,£> . £n]  and  Y=lrji,r/Z . 77,,]  so  that 

{k-Sk+i-Vk  f°r  fc=1.2 . 7i.  Also  let  A j{W)  denotes  the  exponential  divided 

differences  on  the  abscissae  W.  Our  treatment  of  error  in  the  complex  case 
is  based  on  the  following  inequality. 

Lemma.  With  the  notation  given  above 

|Af(Z)l  *A?(X)  (5.1.1) 

Proof.  Use  the  Hermite-Gennochi  expression  (1.3.1)  for  h£{Z)  and  note  that 

|exp[fi+«-i^i-<-i)vl+...}|  =  exp[£i-K{i«.,-£i)t'1+...]  .  • 

Inequality  (5.1.1)  enables  us  to  bound  the  error  in  the  computed  At'(T)  in 
terms  of  Sf{X).  The  bounds  are  similar  to  those  in  section  4.  We  summarize 
the  results  below,  and  leave  the  details  to  appendix  B.  Let  c  be  the  unit 
roundoff  and  e*  be  the  absolute  error  of  A*(Z).  i.e.,  fl(bi(Z))~^i(Z)+ef. 
Define  £*.  the  pseudo  relative  error  in  A *(2T),  to  be  s*=e* /hf(X). 

(l).  Error  growth  of  SR  (Standard  Recurrence).  Suppose  that  A ,fc(Z)  is 
computed  by  SR,  and  also  Re(^i<.fc)^Re(fJ  )  for  i£j  <i+k.  Then,  to  first  order 
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in  e,  the  pseudo  relative  error  e*  satisfies  : 

| r*i^4e+[l+  j  ]-max{ j efo1 1 . 1  e*~M i--  (5. 1.2) 

Proof.  See  appendix  B.  • 

(2).  Error  bounds  of  SS  (Scaling  and  Squaring).  Let  the  radius  y  be 
defined  as  in  §2.4.  Suppose  that  L(Z)  is  computed  by  SS(I1 ).  Then  to  first 
order  in  s  we  have 

Error  bound 

|  c*|  <  [Ccfc  +  Cifcln(max(7.0.7))+C2-rnax(7.C>.7)~2]-£  (5.1.3) 

where  C*.  i-0, 1,2  take  the  same  values  as  in  (4.3.2). 

Proof.  See  appendix  B,  Corollary  (B.6).  • 

The  above  bounds  for  complex  abscissae  Z  are  similar  to  those  for  the 
real  one  in  section  4,  except  that  the  meaning  of  the  error  e*  is  different: 
here  ef  is  the  error  in  &*(Z)  relative  to  if(X).  The  same  analysis  as  in  sec¬ 
tion  4.4  shows  that  the  hybrid  methods  yield  small  ef  like  0(kz)s.  Le..  yields 
&i(Z)  with  small  absolute  error  compared  to  provided  that  the  deci¬ 

sion  function  G  satisfies: 

(1)  G{Z)=(^„)  and  | ay 

(2)  Re(^jJ  i  Re(ft)  for  any  foeZ. 

It  leads  to  the  definition  (3.3.4)  for  G,  i.e.,  G(Z)=(£fl,tu)  such  that 

!  where  A  &}ti€Z:Re(t()=maxRe({j)  j 
€«<A  *  (5. 1.4) 

<j*Z 

For  this  G,  we  always  have  2 G(Z)  >  y(Z)  (usually  G(Z)>y{Z)  except  in  some 
rare  situations).  Therefore,  with  the  above  G,  we  have 
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(3).  Error  bound  of  RH(X).  There  exist  constants  gj  for  the  RH  and  con¬ 
stants  j- 1,2,...  such  that  for  any  X=X^k\ 

|ef|  =ss<fc-l>. 

Proof.  The  proof  is  similar  to  the  one  in  §4.4.  • 

If  one  assumes  C(Z)>y,  then  when  ff=2~24,  the  values  of  gt  would  be  the  same 
as  those  in  Table  (4.4.4). 


Remark  1.  Let  p=^(X)/  j  A^Z)!  and  call  it  the  difficulty  measure  of  Ak(Z). 
The  relative  error  of  Ajk(Z)  is  then  equal  to  psk.  The  question  (whether  we 
can  compute  A k(Z)  with  high  relative  accuracy)  thus  becomes  whether  p  is 
close  to  its  lower  bound  1.  In  the  next  section  we  will  give  an  upper  bound  on 
p  when  the  imaginary  parts  Y  are  close  to  zero  as  shown  in  the  next  section. 
The  upper  bound  is  «.  in  the  general  case. 

Remark  2.  In  the  implementation  of  the  hybrid  methods  one  can  avoid  using 
RH  in  the  real  case  because  one  can  always  order  the  data  so  as  to  be 
nested.  In  the  complex  case  there  may  not  exist  such  an  ordering  and  RH 
seems  unavoidable  in  order  to  secure  good  relative  accuracy  in  the  most 
general  case  (e.g.  500  points  on  a  circle  of  radius  500  in  the  complex  plane). 
However,  in  section  5.3  we  will  show  how  to  salvage  SH  when  the  data  are 
complex. 
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5.2.  Computational  Difficulty 
Let  X,  Y  and  Z  be  as  defined  in  5. 1. 

Mean  value  representation  for  A^Z).  There  exist  real  y.,v  with 

min  rt,  £  a,v  £  max  m.  such  that 
j*l*j  Me  ]*t*J  m> 

A*(Z)  =Af(X)(cos  (jx)  +  isin(v)).  (5.2.1) 

Proof.  From  the  identity  exp($+i7i)---exp(£)(cQs(r))+isin(r}))  and  the 
Hermi:e-Gennochi  (1.3.1)  again,  we  have 

i'i  *'*-1 

0  0  0 
t  "i  *k-i 

0  9  o 

Since  exp  is  positive  on  real  values,  equation  (5.2.1)  follows  from  the  integral 
mean  value  theorem.  ■ 


(5.2.  l)  gives  a  lower  bound  for  |  A/(Z)  i : 

Lower  bound  of  Aj'(Z) 

|  A*(Z)|  i  cos(v)if(X)  >  0.-  (5.2.2) 

In  general,  there  is  no  positive  lower  bound,  as  can  be  seen  from  the  example 

Af([0,2«i])»  =  0. 

&i(Z)  can  be  computed  accurately  if  the  difficulty  p  (cf.  remark  1  in 

§5.1)  is  close  to  1  (notice  that  &f(X)  declines  like  1  /k\  ,  cf.  (1.3.2)).  When 

p  »  1,  it  is  difficult  to  obtain  Af(Z)  with  high  relative  accuracy.  This 

difficulty  is  intrinsic  with  our  methods  of  computation.  One  way  to  overcome 

*  We  conjecture  that  if  the  imaginary  parts  of  the  data  are  restricted  in  (0,2w),  then  all  di¬ 
vided  differences  if  will  never  be  zero.  It  can  be  proved  tor  «:  =  !;  but  we  do  not  know  any  proof 
when*  bigger  than  1. 
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it  is  to  increase  the  precision  of  the  arithmetic  operations  and  the  variables. 
Another  possible  approach  is  to  find  special  formulae  which  build  up  A*(Z) 
from  even  tinier  quantities,  e.g.,  FDD,  the  first  order  divided  difference  for 
exp  in  §2.2,  Unfortunately  for  n> 2  we  do  not  know  if  any  such  formulae  exist. 

We  define  p=p(Z)=Lf~i(X) /|A”-1(Z)|  and  call  it  the  difficulty  of  Z  (for 
exp).  The  bigger  the  value  of p  the  more  difficult  it  is  to  compute  iii(Z)  accu¬ 
rately.  From  (5.2.2),  Z  is  not  difficult  when  |Imag(fi)|^0.45rr,  i  =  l . n, 

for  then  p  ^  6.41.  Examples  of  difficult  Z  are  :  those  abscissae  close  to 

[0.2rri.4Tri . 2A:rri]  (for  all  divided  differences  on  this  abscissae  are  equal 

to  zero,  i.e.,  p=°°).  Another  example  is  Z=[0,i,  2.04254+7. 97730i].  We  com¬ 
puted  A (Z)  with  approximately  7  decimal  precision  and  give  the  correspond¬ 
ing  p  in  the  following  table.  Notice  that  SS  lost  6  digits  in  the  last  divided 
difference,  which  has  difficulty  p  «  IQ6. 


Correct  values 
to  3  digits 

SS 

_ 

P 

Ai_ 

■iMOBEESSHii 

(  0.841  0.460  ) 

1.05 

lAfi 

(-0. 144e-06  -0. 73te-07) 

(-0. 161e-06  -0  800e-07) 

7. 14e06 

Table  5.2.3.  Divided  differences  on  Z=[0,i,  2.04254+7. 97730i]. 


Remark  1.  In  the  application  of  matrix  exponential,  the  need  for  high  rela¬ 
tive  accuracy  in  A -f(Z)  decreases  with  |A *(Z)\.  When  it  is  satisfactory  to  com¬ 
pare  the  error  in  |  Af(Z)|  with  then  the  difficulty  evaporates. 


Remark  2.  In  general  the  difficulty  of  Z  increases  with  the  spread  of  the 
imaginary  parts.  For  example, 


Ai1([Re(Ci),Re(<-a)]) 

|A/([fi.fr])| 


kl-fgl 

|Re«Wz)l  '  |.W-.W| 
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|Re«Wa)l  ' 

So  the  bigger  the  difference  of  the  imaginary  parts  the  larger  is  the 
difficulty.  As  a  point  of  interest  we  also  compute  the  difficulty  on  circles  with 
various  radii  and  number  of  points.  The  results  are  summarized  in  the  follow¬ 
ing  table.  Each  entry  is  the  difficulty  of  abscissae  distributed  uniformly  on  a 
circle  with  radius  7. 


7=5 

7=10 

r  7=15  _ 

7=20 

7=25 

n  =5 

2.2 

3.1 

3.2 

3.2 

3.2 

n=10 

1.7 

7.9 

28.9 

35.9 

45.1 

n=15 

1.5 

4.5 

25.0 

173.6 

301.7 

n=20 

1.3 

3.2 

12.6 

77.5 

651.9 

Table  5.2.4.  Difficulty  of  the  liighest  divided  difference  on  circle. 


5.3.  Ordering  and  Matrix  Argument  Reduction 

A  nested  ordering  may  not  exist  for  general  complex  data  Z.  However,  if  the 
imaginary  parts  of  the  data  are  bounded  by  a  small  number,  then  one  can 
order  the  data  according  to  their  real  parts  and  get  an  almost  nested  order¬ 
ing.  In  this  section  which  is  based  on  the  period  2 rri  of  exp,  we  indicate  a  way 
to  transform  the  data  to  values  that  have  bounded  imaginary  parts. 

Definition  of  The  Reduction  Function  Mod(A) 

Since  exp  has  period  2m  the  strip  —  77<Imag(f)^rr  is  representative.  Let  us 
define  the  argument  reduction  function  for  exp  as  follow: 

Mod(f)af-21:rri  if  (2A:  —  l)7ri  <  Imag(^)  £  {2k  +  l)rri. 

We  have  exp(f)=exp(Mod(f)).  Now  we  are  going  to  extend  the  function  Mod  to 
matrices.  Let  J  be  the  Jordan  normal  form  of  A,  i.e.  A-P~XJP,  and 
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y=diag(/ii . Jit)  where  Jm  is  the  Jordan  block  with  diagonal  equal  to 

eigenvalue  Xm  of  A.  Let  km  be  the  integer  such  that 

(2k  —  1 ) 7TT  <  Imag{Xm)  <(2k  +  l)m. 


Define 

(1)  Mod(Jm)sJm-2kmm/; 

(2)  Mod(J)sdiag(Mod(/il) . Mod^)); 

(3)  Mod(A)sP~,Mod(/)P. 

It  is  not  difficult  to  prove  that  exp(A)=exp(Mod(A))  according  to  (l).  (2) 
and  (3).  Thus  Mod  generalizes  argument  reduction  to  matrices  and  yields  a 
matrix  that  has  eigenvalues  with  bounded  imaginary  parts. 

As  we  have  mentioned  in  the  introduction,  the  application  behind  the 
computation  of  Aj'exp  is  matrix  exponentials.  If  one  applies  the  matrix  argu¬ 
ment  reduction  before  computing  the  exponential,  then  all  the  eigenvalues  of 
the  matrix  would  have  bounded  imaginary  parts,  thus  solving  the  ordering 
problem  in  the  computation  of  the  divided  differences. 


Remark  1.  There  is  another  way  to  reduce  the  imaginary  parts  of  the  data: 
since  Aexp=exp(Z'n),  we  may  apply  argument  reduction  directly  on  Zn  and 
compute  exp(Mod(Zn)).  However,  the  bidiagonal  structure  of  Zn  will  be  des¬ 
troyed  by  the  reduction  and  therefore  some  modifications  of  the  algorithm 
TS  are  needed.  The  work  for  the  whole  computation  increases  significantly. 


Remark  2.  For  the  computation  of  Mod(A),  there  is  a  stable  method  which 
avoids  using  the  Jordan  decomposition  of  a  matrix.  When  A  is  triangular  the 
work  needed  is  approximately  n3/  3  operations  which  is  quite  practical.  We 
will  give  an  algorithm  for  argument  reduction  in  another  paper. 


43 


section  5 


5.4.  Conclusion:  SH  for  data  with  restricted  imaginary  parts 

Although  RH  gives  the  divided  differences  with  guaranteed  accuracy,  it  is 
impractical  to  implement  it  unless  the  order  of  the  divided  differences  is 
very  small  like  3  or  4,  because  the  number  of  operations  grows  like  2n.  Sec¬ 
tion  5.3  shows  that  (assuming  one  has  the  matrix  function  Mod(A))  one  can 
consider  matrices  with  eigenvalues  close  to  the  real  line,  so  there  is  no  loss 
of  generality  in  considering  Z  with  imaginary  parts  bounded  by  rr.  There  are 
two  advantages  to  small  imaginary  parts.  The  first  is  that  we  can  order  the 
abscissae  according  to  their  real  parts  and  obtain  an  almost  nested  ordering 
(according  to  the  G  defined  in  section  5.1).  Thus  one  can  apply  SH  (Simple 
Hybric.  method)  instead  of  RH  (Recursive  Hybrid  function).  The  second  is 
that  the  backfilling  step  in  SS  is  stable,  which  implies  that  one  can  replace 
SS(I1)  by  SS  with  only  slight  sacrifice  on  accuracy.  But  the  trade  off  is 
significant,  since  SS  takes  0 (n2)  operations  and  requires  only  a  few  vectors 
for  storage  while  SS(II)  take  ^(ra3)  and  requires  a  matrix  storage.  We  con¬ 
clude  his  section  by  proposing  the  following  . 

Computation  of  A(Z).  Given  Z  with  R e(Z)  in  increasing  order  and 
|Imag(Z)|  s:rr.  Use  algorithm  SH  with  the  following  G  to  compute  A(Z). 

Decision  Function  G  for  SH  on  Z.  The  function  G  on  Z=[('1 . £n]  is 

defined  *,n  be  G(Z)=(('n,Ci)1‘,  and  the  decision  is,  for  z<j, 

ti.fj  belong  to  the  same  cluster  if  Re(£,  -<\ )<£,_* 
where  the  values  of  gt  1=1,2,...  can  be  those  in  (4.4.9).  • 


Numerical  Results.  We  ran  the  SH  algorithm  on  Z  that  has  the  same  real 
parts  as  in  (4.4.11)  but  with  the  imaginary  parts  =  ±ir.  The  results  are 
t  For  such  Z  and  G,  one  can  show  that  2(C(Z)+tt)  >  y (Z). 
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summarized  in  the  following  table. 


n 

<n 

Correct  values 
to  7  diflits 

SH 

dig’.ts 

lost 

l 

-27.0+rri 

(-0.1879529d-ll  -0. 1643136d-18) 

(-0. 1879529e-l  1  -0. 16431 36e- 18) 

0. 

2 

-26.0- rri 

(-0.7978484d- 13  -0.5013023d-12) 

(-0.79784 83e-13  -0.5013023e-12) 

0. 

3 

-15.0+rri 

(-0.1 747305d-08  0.9981036d-09) 

(-0. 1747305e-08  0.998l036e-09) 

0. 

4 

-14.0-  rri 

(0.4155l01d-09  -0.4757347d-09) 

(0.41 551 02e-09  -0.4757347e-09) 

0. 

s 

-12.0+ rri 

(0. 1814275d-09  0,1323147d-08) 

(0. 18l4274e-09  0.1323147e-08) 

0.43 

6 

-10-rri 

(0.759l439d-09  -0.5204460d-09) 

(0.759 1440e-09  -0.5204460e-09) 

0.32 

7 

-8.0+rri 

(0.4204520d-09  0.51 19519d-09) 

(0.4204520e-09  0.51 19519e-09) 

0. 

8 

-7.9- rri 

(0.209l884d-09  -0,3885012d-10) 

(0. 2091884 e-09  -0.3885013e-10) 

0.20 

9 

-7.8+tti 

(0.4721783d-10  0.2416627d-10) 

(0.4721784e-10  0.2416627e-10) 

0.48 

10 

-2.7-rri 

(0.2229288d-10  -0.12S5209d-10) 

(0.2229289e-10  -0.1255209e-10) 

0.90 

11 

1.0+ rri 

(0.1 147709d-10  0.8703504d-l  1) 

(0. 1 147709e-10  0.8703510e-U) 

0.99 

12 

l.i-rri 

(0.3820952d-ll  -0.7453l52d-12) 

(0.3820956e-l  1  -0.7453158e-12) 

1.18 

13 

l.2+rri 

(0.7360573d-12  0.2693091d-12) 

(0.7360580e-12  0.2693094e-12) 

1.25 

14 

i.3-rri 

(0.1204098d-12  -0.1441720d-13) 

(0. 1 204099e- 1 2  -0. 1 44 1 722e- 1 3) 

1.29 

15 

3.0+rri 

(0.1798853d- 13  0.5672050d-14) 

(0. 1798856e-13  0.5672058e-14) 

1.42 

16 

7.0- rri 

(0.3734263d- 14  -0.8906243d-15) 

(0.3734269e-14  -0.8906257e-15) 

1.43 

17 

9.0+rri 

(0. 7041432d-15  0.2104887d-15) 

(0.7041446e-15  0.2l04891e-15) 

1.51 

18 

13.0-  rri 

(0.1 70549 Id- 15  -0.5255464d-16) 

(0. 1705495e-15  -0.5255475e-16) 

1.51 

19 

24.0+rri 

(0.1783471d-15  0.22438 19d- 15) 

(0. 1 78347 le- 15  0. 22438 19e- 15) 

0.27 

20 

25.0- rri 

(0.954005 Id- 16  -0.1855942d-16) 

(0.9540053e-16  -0.1855942e-16) 

0.60 

Table  (5.4.  l). 


Test  example  for  SH  on  complex  data. 
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6.  Application  to  Computing  Matrix  Exponentials 

6.1.  Representation  of  f(A)  by  the  Newton  interpolating  polynomial 

Let  A  be  nxn  and  let  /  be  any  scalar  function  with  at  least  n  continuous 

derivatives  at  the  eigenvalues  f  t . £n  of  A.  Associated  with  /  is  the  unique 

polynomial  of  degree  n— 1  which  interpolates  /  at  the  ft.  A  convenient 
representation  of  this  polynomial  was  given  by  Newton, 

Pn-l(t  )  =  /  (ft)  +  ’gif/  ft  (t  -ft  )  . 

feat  i* 1 

Here  A*/  denotes  the  k-th  order  divided  difference  of  /  at  the  abscissae 
ft . ft+i  • 

A  fundamental  result  in  matrix  theory  (see  [2])  is  that 

fM=Pn-M)  .  (6.1.1) 

That  is, 

Newton  interpolating  polynomial  of  f  (A). 

fU)  =  A?/-/  +’g,A,k/ftU-ft/).  (6.1.2) 

*al  ;=1 

In  our  applications,  A  is  in  triangular  form.  Therefore  the  eigenvalues 
are  just  the  diagonal  elements  of  the  matrix  and  the  matrix  products  can  be 
formed  efficiently. 


6.2.  Matrix  exponentials 

Let  A  be  triangular.  Since  exp  is  periodic  on  the  imaginary  axis  with  period 
2rr,  we  can  use  argument  reduction  in  matrix  form  (cf.  section  5.4)  on  A, 
replace  A  by  another  triangular  A  such  that  exp(A)=exp(/l’)  and 
|  Imag(a’i  i)  |srr.  There  is  no  loss  of  generality  in  assuming  that  argument 
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reduction  has  been  done  and  therefore  the  imaginary  parts  of  the  eigen¬ 
values  of  A  are  bounded.  Now  we  can  apply  SH  on  the  eigenvalues  to  obtain 
the  divided  differences  and  compute  exp  (A)  by  (6.1.2). 
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Appendix  A 

Algorithm  &  Error  Analysis  for  computing  the  first  row 
of  exp  (Zn)  by  Taylor  series 


(I).  Algorithm 


For  later  reference,  we  consider  here  a  more  general  matrix  Zn.  Let 
d(l),--\d(n)  be  the  first  row  of  F=exp(Zm)=I +Zn  +  Z£/ 21+---,  where 


Zn  = 


fl  Q1 
Kz  a-2 


On- 1 

<n 


Let  Pq  =  /  and  Pk  =  Z£  /k !  for  fca:l.  An  obvious  way  to  compute  F  is  : 


fOi  Set  F-I .  k  =  1  and  P0-I. 

1 1)  If  F  has  converged,  stop. 

(2)  Evaluate  Pk  =  Pk^lZn  /fc. 

(3)  Update  F  and  k  :  k  *-  fc  +  l  and  F  *-  F+Pk,  go  back  to  (l). 
Here  step  (2)  implies  at  fc-th  loop, 


^(U)=p[fi-PJt_1(l,i)+ai.l  Pfc_1(l.i-l)]  .  i=2,3, 
where  Pk(i.j)  denotes  the  i,j- th  element  of  matrix  Pk. 


(A.1) 


Notice  also  that 


F*(l,i)=0  when  fc*si-2  and  _Pi_i(l.i)s= 


ft 

_  J=i 


(i-1)! 


for  2^i^n,  for  Z*  is  a  band 


matrix 
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Zk  = 


ff  •  ■  ft«, 

>= i 

k*\ 

-fc  •  •  Tli 

;=2 


ft  «, 


J=n-k  *-l 


with  bandwidth  fc +1.  Hence,  we  have 


d(i)=  2p*(i.i)  =  E  flk(u) 

*iO  jtit  - 1 

=  E  flk'+i- i(l.i)  .  l=si^rt 

fc'iO 


Set  sJt(i)=F*>i_1(lIi).  we  have 


d(i>=,5,S*(i)  (A.2) 

and  from  eq.(A.l), 


(a) 

(b) 

(c) 


ft* 


_  /=! 


Sjt(l)sFJk(l.l)=f, 


for 

Pk-  i(l.l) 


=  1)  for  fc>0, 


s*(i)aF*>i.1(l.i) 

=  lfcTi-l  ^  s*-»(i)+ai-i's*(i~1)] 


(A.3) 


fori=2,3 . n  andfc  =  l,2,.. 


Equations  (A.3)  suggest  the  algorithm  TS  in  section  2.2  for  cf  (l),...,d(n), 
where  aj  =  l  for  j =1,2,. ...n.  TS(II)  is  just  a  simple  generalization.  ■ 
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(II).  Error  analysis 

Here  we  develop  error  bounds  on  the  computation  of  d(i)  via  equations  (A.2) 
and  (A.3)  (with  Oj-l.  j  =  For  simplicity  |eij^£(unit  roundoff) 

will  denote  the  rounding  errors  Introduced  by  basic  arithmetic  operation 
(+.-.*./).  e.g.  /Z((a+6)“c)=(l+e1)(l+e2)(a+6 )*c  etc..  Define  the  absolute 
error  ek (i)  in  sk(i)  by 


Bfc(i)  3  fi(sk(i))  -  sk(i)  .  (A.4) 

From  (A.2),  af(i)  is  computed  via  the  truncated  series  ^sfc(i)  for  some  l. 

k—0 

Therefore  for  l^i^n  and  e0»0. 

/Z(d(i))-d(i)=/Z(£/Z{sk(i)))  -  £sk(i) 

k~0  Jk=0 

=/i(sc(i))I<I(l+ei)  +  ifi(sk(i))-il(l-i-Sj)-Zsk(i) 

j=l  k* 1  j-k  k= 0 

=(so(i)+««(i))iI(l+e/)  +  £  (s^O-H^COyfen-e^-E  s*^' 

/»1  >=fc  *=o 

That  is. 


/Z(c£(i))-d(i)=-  J  sk(i)+£sk(i)[  rt  (1 +«>)-!]+ ie*(i)-  (l+Sj). 

i=i  +  l  *»0  /smax(l^)  k  -0  / =raax  (ljbj 

■/  +  //+///. 


In  order  to  bound  /.  II  and  III,  we  need  the  following  two  lemmas.  Let 
7= max  |  Ci  I  • 

Lemma  1.  For  liiin,  |sfc(i)  |s^-s0(i).» 

Lemma  2.  If  12(Z-t-7i-l)8£<l,  then 

|ek(i)!s3(*+i-l)<r-  s0(i).» 
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We  postpone  the  proofs  of  the  lemma  1  and  2.  As  a  consequence  of  the  lem¬ 
mas,  we  have  the  following  bounds: 

Error  bound  on  d(i).  Pick  l  large  enough  so  that  |/|  is  less  than  e.  For 
example,  l  -  1  +  max  (  [67]  ,  [~log2e]  )  would  be  sufficient.  Here  [x]  is  the 
greatest  integer  <.x.  Assume  the  condition  in  lemma  2  holds,  and  also 
assume  3(7+71— 1)1  e  <  1.  then 

|/i(cf(i))-d(i)]^[3(i+7)+i]£eV(i-l)!.  (A.5) 

Proof.  Since  &!>-'/2rr£ ’(— )k, 

e 

Moreover  l  was  chosen  so  that  l>  67  and  2~l  <  e,  so  |/|  <  sey.  For  //. 

notice  that  [  (l  +  r)-l]  <  (Z  -k  +  l)e  <  (Z  +  l)r,  so 

flj:} 

|//t '  *  (£  ^-)(i  +  l)  £  S0(i)  <  (l  +  l)Ee?s0(i). 

t=o  K- 

Finally,  apply  lemma  2,  III  is  bounded  by 

|//7|sS  3(k  +i— l)e-  ?-[—  ■So(^)'(l‘*"^e) 

<[3  £  +  3(i— l)  §  •  ^-](l+i£r)-sc(i) 

Sl[3(7+i-l)+3f  (7+i-l)£]E’e?-s0(i). 

Therefore,  by  the  assumption  3(7+71 -l)Ze  <  2, 

|/|-+|//!+|///|  ^  [l+37+3i -3+Z  +  l+3(7+i— l)Zc]ee7s0(i). 

<  [3(7+i)+Z]s  e7s0(i).» 
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Our  bound  (A.5)  suggests  that  if  d(i)aAJ~1exp  «  e7/(i— 1)!  then  TS  will 
not  yield  a  small  relative  error.  For  later  reference,  we  derive  the  following 
corollary.  Let  XsRe(Z)  (cf.  appendix  B  for  notations),  we  have 

Corollary.  It  Aexp  is  computed  by  TS.  then  to  first  order  in  e 


/l(A{-lCZ)exp)-Aj~l(Z)exp 


Af'Wexp 

Proof.  Since  X  is  real,  (1.4.1)  implies 
sc(i)s  ji~~.  equation  (A.5)  yields  (A.S).  ■ 


|  <[2(j.+y)¥l]s-e27. 
Ai~l(x)exj?  Sr  ^£y;- 


(A-6) 

Since 


Remark.  The  bound  (A.5)  is  pessimistic,  especially  when  y  is  smalL  For 
example,  when  7<1.  the  error  ek(i)  in  sk(i )  would  decrease  rapidly  with  k 
and  therefore  be  far  smaller  than  the  rounding  error  ;nade  in  aiiding  sk(i )  to 
d(i).  In  general,  II  is  much  larger  than  I  and  III  and  vre  always  have 

fl(d(i))  -  d(i )  »  //  =  £s*(i)[  ft  (1+6;)  — l] 

k«0  jatn-ut.  [ljej 

a  £$*(*)(«* +s,t*i+”+e{)- 

k=0 

*0-0 

Since  1  £i+£2+‘"4Ej  I  grows  like  VJ  e,  it  is  conceivable  that  the  absolute 
error  of  the  divided  differences  will  be  bounded  in  magnitude  from  e/(i— 1)1 
to  VT-e7/ (i— 1)!.  We  therefore  have 

An  estimated  bound  of  the  error  in  d(i). 

ess  |/Z(d(i))-d(i)|  (i-l)»  £  VT-ee7.-  (A.7) 

Various  numerical  results  confirm  that  both  the  upper  and  lower  bounds  in 
(A.7)  are  reasonable.  Examples  can  be  constructed  so  that  the  errors  grow 
with  y  and  are  about  .01  to  .1  of  the  upper  bound. 
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Proof  of  lemma  1. 

Since  sc(i)sl/  (i-l)!  and  sJ.(l)=f1*:/fc! ,  lemma  1  is  true  for  k  =0,i  =  1,2,... 
and  i  =  l,h  =  1,2,....  Assume  that  the  lemma  is  true  for  sk(i— 1)  and  sfc_j(i). 
then  according  to  (c)  of  (A.3)  and  So(i— l)=(i-l)sc(i), 

!■ s*(i> It50*4-1'] 

£s'  irSoii)- 

By  induction  oni+A:,  i>  1,  fc> 0,  lemma  1  holds  for  alli,A:.  • 

The  bound  above  is  best  possible. 


Prao f  of  lemma  2 

First,  let  us  establish  the  recurrence  relation  among  ek{i)’s  according 
to  equations  (A.3).  Consider  (a)  of  (A.3),  if  s0(i)=  -  is  computed  by 

s0(l)=l,  sc(i)=sc(i-l)/(i—l)  for  i>l,  then 


e0(i ) 5/Z (s0(i ))-s0(i)=fl (  -°f  ) -s0(i~l) 

,/f(s0(i-l)) 

=(1+^i) - jzi - so(*) 


=e,s0(i)+(l+5i) 


e0(i-l) 
i— 1 


Hence 


(a)'.  e0(l)=0. 


\eQ(i)\tze-s0(i)+(l+e) -  fori>l. 


Next  we  consider  equation  (b)  in  (A.3).  We  have 


t 
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e*(l)s/£(s*(l))-sJfe(l)=/i(~  SA:-,(l))  -s*(D 

=  (l+Sl)(1  +  ff2)^"C5*!-l(1)+efe-l(1)]  “  sk(l) 

=[(l+ei)(l+S2)-l]sfc(l)  +  (1+*lX1+*2)  • 

By  lemma  1,  we  get 

(b) '.  |eh(l)|sS[(l-t*e)8-l]|7-So(i)+(l+e)2J-|eii_1(l)|  forfclsl. 

Finally.  (c)  of  (A.3)  implies 

e*(i)5a/i(s*(i))-s*(i)=/2(-^|::^-[f<  s<,..1(i)+sib(i-l)])  -sfc(i) 

» - »»(i) 

=  <»%&«>  i)  +  «*.-.(*>]  -*,W  + 

Therefore,  for  iM.fcil, 

(c) -.  |.t«)|*[(H-«)»-l  ]&•«»(*>+ 

Now  we  prove  lemma  2.  It  is  not  difficult  to  show  by  induction  that 

(i) .  equation  (a)' implies  je0(i)|s:[(l+e)<_,-l]so(i)  and 

(ii) .  equation  (b)’ implies  |e*(l)|s[(l+5)8fc-l]^j--s0(l). 


So  lemma  2  holds  for  A=0,iil  and  i  =  l,£fcl.  Let  fcil  and  i>l.  Assume 
lemma  2  holds  for  e*(i- 1)  and  eto_i(i).  From  (c)\ 


54 


Appendix  A 


|«*(i)|s(2S+4e*)^-s0(i)  +  j^^-[7|eJb_l(i)|  +  |e*(<-l)|] 

<2+r+i-\*AE  +3(i+4ff)(*+i-2)> 

=[3(fc+i-l)+12(fc+'i-l)e  -  ]e^-s0(i). 

Therefore,  when  12(1 +7i-l)*e25l, 

|e*(i)  |^3(A:  +i  -l)e  -sa(i). 


By  induction  oni+fc,  i>l,A:&i,  lemma  2  holds  for  all  i.fc. 
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Appendix  B 

The  error  analysis  of  SS  and  SR 


Recall  the  definition  of  the  following  the  notations  (cf.  section  5).  X  and  Y 
denote  the  real  and  imaginary  parts  of  Z.  Le.,  Z-X-HY  (recall  .Af =[••&■•]. 

and  r=[--T7i-])-  pseudo  relative  error  in  Af(Z)  is 

/l(A?(Z))-A*(Z) 


*<k(Z)> 


±i(X) 


We  will  suppress  Z  in  s?(Z)  if  there  no  risk  of  con¬ 


fusion.  tj  denotes  the  roundoff  error  introduce  by  basic  arithmetic  opera¬ 
tions  (whether  complex  or  real),  e.g.  fl(a(b  +c))=(l+Ej)(l+£2)a(6  +c)  etc.. 
We  always  have  |  Cj  |^r. 


(I).  SR  (Standard  Recurrence):  error  growth  of  one  step  SR. 

If  &?(Z)  is  computed  by  — - — - - - ,  and  if  Re(^iA.fc)s:Re(fJ)  for 

-  Ci 

then,  provided  that 

4-[i+  -r—i ~-r]-max(  |  | ,  |  |  J+3 e+*-  <  1 

the  pseudo  relative  error  Ci-Si(Z)  satisfies  : 


|  e/*|s4c+[l+  |  ft |  3~max^  I  e*+il I  •  I  *i~l  I  J-  (B.  1) 


Proof.  We  have 


/l(tf(Z))=/i 


fi(*M(z))-/W-'(z)) 
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EtVl’-E*'1 


=  (l+£1)(l+e2)(l+e3)  [A*(Z)+etV11  triX)  +  Af-‘(A-)-  ,  , 

Ct>fc  Ci  Ci 

=  (l+e1)(l+ea)(l+ea)-[Af(Z)+efcil-A«^)- +  tf~x(X)  *'*f ]■ 


Ci+fc“Ct 


Ci+*  Ci 


Therefore 


|/f(^(Z))-A*(Z)|<(l  +  £)3-l]jA^)|  + 

+  (l+e)3[AfCr)+  |  [  Atfc~‘(A')]  max{  j  efo1  1 . 1  e? I  { 

Af~l(X) 

Since  by  (4.1.8)  ~r-  <.k  and  by  (5.1.2)  |  &?(Z)  \  *Sbf(X),  the  bound  follows 

Ai  (a) 

by  a  direct  estimate  on  the  above  equation.  ■ 


Remark.  The  requirement  Re(fil.fc)&Re(fi)  for  it£j<i+k  is  necessary  only 
when  the  imaginary  parts  of  the  data  are  close  together.  It  is  used  to  show 


itrf(X)-*?-l{X) 
CiM:  “Ci 


*  \HX). 


(*) 


When  the  imaginary  parts  are  not  clustered  together,  one  can  replace  the 

condition  by  a  more  general  one.  Let  xsdiamfl^ . Ci+tl)  and 

2  sdiam([^j . Ct+*])-  then  with  (4.1.6)  one  can  show  that 

1  xz 

ra  *-* <k 


implies  the  inequality  (*). 


(II).  SS  (Scaling  and  Squaring):  error  bound  of  SS 

We  only  do  the  analysis  of  SS(1I).  Recall  7=7(2)  Hmax)ft -77)  where 
f?=(£]C t)/n-  1°  SS(II)  the  number  of  squaring  K  is  chosen  such  that 

t 
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2 ~Ky  ^  0.7.  Let  L  denote  the  number  of  terms  need  in  TS.  Provided  that 
[(3-e  lA+K—l)k  +  2x~i(2+e  iA(2.  l+l))— 2]zc  ^  0.5  .  we  have 

Error  bound  of  SS(II) 

|e*|  s[(3-8I-4+A>+2A:'(2+e1-4(2.1+i))-2]'£  (B.2) 

Proof.  There  are  two  sources  of  error  in  SS(II):  the  initial  error  in  TS(1I) 
(Taylor  Series  algorithm  (11))  and  the  error  introduced  by  squarings.  Since 
2~x~/£0.7,  we  have  from  equation  (A.  6) 

|  z*(2~KZ)  ( <3*  +(2. 1+i  )]se  '■*.  (B.3) 

(B.3)  can  be  written  as  \  s^\^(C[0)k +Ci<3',)s.  where  C[0)=3  e14  and 
C^=(2.1+E)  eu.  Suppose  |  e*’, ^.(C^k  +  Ca)e,  we  now  investigate  how  much 
the  error  would  change  after  one  squaring.  According  to  method  SS  (cf.  2.4). 
we  compute  <i(2Z')  in  modified  step  3  for  some  Z ’  by  R±(Z')2R~l.  Compare 
the  (i,i+fc]-th  elements  of  both  sides  we  get 

Af(2 Z')  =  2*-  itf(Z')  ±tf(Z')  .  (B.4) 

J-Q 

Thus,  if  multiplication  by  2fc  can  be  computed  exactly,  we  have 
| fl(Af(2Z‘))-tt(.2Z’)\  as  |/|  +  |//|, where 

I  •  2*  f fl(i*1(Z')  *W(Z') )  -  ifl(*KZ’))  fl(i>&(Z')) 

[  f«0  ;«o 

II  =  2*  f £/i(A/(Z’))  /f(A^(Z’))  -  i±KZ')  htf(Z') 

\iB  o  /*  o 

From  \±!(Z’)\£*i(X')  and  |/Z(A/(Z’)))|  =  i ^(.Z')^i(Z')  ^(X')\M^tiMX'l 
we  have,  following  some  standard  error  analysis,  if 


then 


(Ctk  +  C8)(2+ c[  Cjfc  +  C2])e  <0.5, 
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| / 1  s:2*e  £](*-; +  1)(  1+1  iRZ) | )( 1  + 1 cjjri(Z') | )At(X')-Afrf(Jr) 

j=o 

£  (k  +1.5)e-2fc  ■  t  hXX'y^-KX') 

j=o 

<;  (k+ 1.5)e-Af(2X*)- 

Similarly,  when  (Cik  +  C2)2  c<0.5,  we  have 

|//|*2*-£  I cRD  S/m-AtfCZ')  +  e/^(Z-)  A/(2')  Aj*^(^)  +  ^{Z')  z^RZ’)  SH.X-)  ^{X’) 
i-  o' 

*  [Ctj + c2+c,(jfe  -/)>  c«+(Cik +c2)8]e-2i  £  Mwy&Rx') 

y»o 

s;  (C1fc+2C2+0.5)e  A^(2A’'). 


Thus, 


c/(2Z') 


|/l(Atfc(2Zl))-Aje(2Z,)l 

A*(2X) 


[(C,,+l)fc+2Cz+2]£. 


So.  if  we  update  C'^*-C’(°*  +  l  and  CP*-2CP  +2  ,  we  have 

eRZ~^K~^Z)  jC  [C(^A:  +  C^^]e.  It  is  now  obvious  that  we  can  repeat  the  above 
argument  and  obtain,  after  K  squaring,  the  following  bound: 

e/(Z)  s:  [C't^h  ]e.  (B.5) 

where 


C[W  =  C[0)  +  K  and 
Cp  =  2if(Ci0)  +  2)  -  2 

Since  the  assumption  before  (B.2)  is  (C{K~^k  +  C&K~l) )*■  e< 0.5,  which  obviously 
implies 


(C^,Jfc+C^)(2+c[Cp,/fc  +C^',])c  <  0.5  and 
(C^Jk+CP)2  £  0.5  for  any  0*j*zK-l. 
Thus  justifying  (B.5).  and  (B.2)  follows  (B.5).  • 
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Appendix  B 


Corollary. 

|  Ci  |  s[ Crfc  +  Cyk  ln(max(7.0.?))  +  C2‘  max(7.0.7)-2]- 1  (B.6) 

where 

C0=3-el-4-log2(0.35)  »  13.63 
Ci  =  l/ln(2)  sa  1.4427 

C2=(2+e1-4(2.1+Z))/0.35.  where  Lr  is  chosen  such  that  £(0.7)1/  j!  «g  e. 

Proof.  When  7^0.7,  (B.6)  follows  from  (A.6).  When  7>0.7,  we  have 
0.7fe2"*7>0.35,  hence 

2A'  <7/0.35 
K  <  !og27  -  logy (0. 35)  • 

Equation  (B.6)  follows  from  (B.2).  • 


Remark.  Bound  (B.6)  is  quite  pessimistic.  We  observe  that  the  actual  error  is 
approximately  0.01  times  what  (3.6)  gives.  The  following  are  some  numerical 
examples.  Consider  Z  that  has  n  abscissae  distributed  uniformly  on  the  reed 
axis  from  -7  to  7.  We  ran  our  SS  algorithm  on  a  Vax  with  e=2"Z4  on  this  exam¬ 
ple  •with  various  n  and  7.  The  results  are  summarized  in  the  following  table: 
the  entries  under  (B.6)  are  the  bounds  (as  a  multiple  of  e)  obtained  from 
equation  (B.6)  and  those  under  SS  is  the  maximum  magnitude  of  the  error  in 
the  divided  differences  computed  by  SS  (as  a  multiple  of  e). 


*  In  particular  when  i  =9  and  Ibis  (cf.  §2.3). 
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